
docs: add ECS Fargate Spot reference architecture #764

Closed
chaodu-agent wants to merge 10 commits into main from docs/refarch-ecs-fargate-spot

Conversation

@chaodu-agent
Collaborator

Summary

Add docs/refarch/aws-ecs-fargate-spot.md — a step-by-step guide for deploying OpenAB on AWS ECS Fargate Spot at ~$2.7/month.

What it covers

  • Architecture diagram (init container + main + sidecar pattern)
  • Cost breakdown
  • 7 deployment phases: secrets, IAM, infra, config, task def, auth, verify
  • S3-based auth persistence across Spot interruptions
  • Key gotchas (file ownership, memory sizing, no NAT needed)

Motivation

ECS Fargate Spot is the cheapest AWS option for running a single OpenAB bot. This refarch enables users to prompt their coding CLI with:

"per docs/refarch/aws-ecs-fargate-spot.md, run an openab on ECS for me"

and have the agent handle the full deployment autonomously.

@chaodu-agent requested a review from thepagent as a code owner on May 6, 2026, 23:55
@github-actions Bot added the labels pending-maintainer, pending-screening (PR awaiting automated screening), and closing-soon (PR missing Discord Discussion URL; will auto-close in 3 days) on May 6, 2026
@github-actions

github-actions Bot commented May 6, 2026

⚠️ This PR is missing a Discord Discussion URL in the body.

All PRs must reference a prior Discord discussion to ensure community alignment before implementation.

Please edit the PR description to include a link like:

Discord Discussion URL: https://discord.com/channels/...

This PR will be automatically closed in 3 days if the link is not added.

@shaun-agent
Contributor

OpenAB PR Screening

This is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Click 👍 if you find this useful. Human review will be done within 24 hours. We appreciate your support and contribution 🙏

Screening report

Intent

PR #764 adds a new AWS reference architecture document for running a single OpenAB bot on ECS Fargate Spot. It is trying to solve a deployer/operator problem: OpenAB can be cheap to run, but there is no concrete AWS deployment guide that explains the required secrets, IAM, persistence, task layout, and Spot interruption handling.

The user-visible outcome is a copyable deployment playbook that an operator or coding agent can follow to provision OpenAB on AWS at a low monthly cost (the PR description estimates about $2.7/month).

Feat

This is a docs improvement with deployment-architecture guidance.

Behaviorally, it does not change OpenAB runtime code. It adds docs/refarch/aws-ecs-fargate-spot.md, covering an ECS Fargate Spot deployment pattern using init, main, and sidecar containers, S3-backed auth persistence, IAM/secrets setup, task definition details, and verification steps.
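
To make the init, main, and sidecar layout concrete, here is a minimal Python sketch that builds a dict shaped like an ECS task definition. It is not taken from the PR: the container names, images, the `/auth` path, the sync interval, and the CPU/memory sizes are all illustrative assumptions, and the real document may structure the task differently.

```python
import json


def build_task_definition(bucket: str, image: str) -> dict:
    """Return a dict shaped like an ECS task definition.

    Every name, image, path, and size below is a hypothetical placeholder,
    not a value from the reference architecture itself.
    """
    # One task-scoped volume shared by all three containers.
    mount = [{"sourceVolume": "auth", "containerPath": "/auth"}]
    return {
        "family": "openab-bot",  # hypothetical family name
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": "256",  # assumed sizing
        "memory": "512",  # assumed sizing
        "volumes": [{"name": "auth"}],
        "containerDefinitions": [
            {
                # Init container: restore auth state from S3, then exit.
                "name": "auth-restore",
                "image": "amazon/aws-cli",
                "essential": False,
                "command": ["s3", "sync", f"s3://{bucket}/auth", "/auth"],
                "mountPoints": mount,
            },
            {
                # Main container: starts only after the restore succeeded.
                "name": "openab",
                "image": image,
                "essential": True,
                "dependsOn": [
                    {"containerName": "auth-restore", "condition": "SUCCESS"}
                ],
                "mountPoints": mount,
            },
            {
                # Sidecar: periodically push auth state back to S3.
                "name": "auth-sync",
                "image": "amazon/aws-cli",
                "essential": False,
                "entryPoint": ["sh", "-c"],
                "command": [
                    f"while true; do aws s3 sync /auth s3://{bucket}/auth; "
                    "sleep 300; done"
                ],
                "dependsOn": [{"containerName": "openab", "condition": "START"}],
                "mountPoints": mount,
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_task_definition("example-bucket", "example/openab"), indent=2))
```

Spot capacity is not part of the task definition; it is selected through the `FARGATE_SPOT` capacity provider when the service or task is launched. The `dependsOn` conditions are what express the init-then-main ordering in this sketch.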

Who It Serves

Primary beneficiaries:

  • Deployers who want a low-cost AWS deployment path
  • Agent runtime operators running a single OpenAB bot
  • Maintainers and reviewers who want a documented reference deployment
  • Coding agents asked to provision OpenAB from repo documentation

Rewritten Prompt

Add a new reference architecture doc at docs/refarch/aws-ecs-fargate-spot.md that explains how to deploy a single OpenAB bot on AWS ECS Fargate Spot.

The document should be practical enough for a human or coding agent to follow end to end. Include the target architecture, estimated monthly cost assumptions, required AWS services, secrets/IAM setup, infrastructure steps, task definition shape, auth persistence strategy, deployment verification, and known operational gotchas.

Keep the guide scoped to a single-bot, low-cost ECS Fargate Spot deployment. Clearly call out assumptions, failure modes from Spot interruption, and how auth state survives task replacement. Avoid implying this is the only production architecture.

Merge Pitch

This is worth advancing because it makes OpenAB easier to operate outside local/dev environments and gives deployers a concrete low-cost AWS path. The PR is low runtime risk because it only adds documentation.

The main reviewer concern should be accuracy: AWS pricing, IAM permissions, task definition details, S3 auth persistence, and security posture need careful review. A misleading deployment guide can create more support burden than no guide, especially if users copy it into production.

Best-Practice Comparison

OpenClaw principles that fit:

  • Explicit delivery routing is relevant if the doc describes how OpenAB receives and sends events in ECS.
  • Durable job persistence is relevant only if the deployment needs to preserve job state beyond auth/session data.
  • Isolated executions fit the containerized ECS model.
  • Retry/backoff and run logs are relevant as operational guidance, especially around Spot interruption and task restarts.
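
As an illustration of the retry/backoff guidance, the following is a generic sketch of exponential backoff with full jitter. It is not OpenAB or OpenClaw code; the function name, defaults, and the injectable `sleep` parameter are assumptions chosen for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    op: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call op() until it succeeds or the attempts are used up.

    The delay ceiling doubles after each failure (capped at max_delay),
    and the actual wait is drawn uniformly from [0, ceiling] so that
    restarted tasks do not all retry at the same instant.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

A caller might wrap a step that can fail transiently after a Spot replacement, such as the initial auth restore, and log each failure before the next attempt.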

OpenClaw principles that may not fit directly:

  • Gateway-owned scheduling only matters if this deployment includes scheduled jobs or recurring agent tasks. For a basic bot deployment doc, it should not be forced.

Hermes Agent principles that fit:

  • Atomic writes for persisted state are relevant to S3 auth persistence and task interruption safety.
  • Fresh session per scheduled run is relevant only if the deployment runs scheduled tasks.
  • Self-contained prompts for scheduled tasks are relevant if the doc is intended for coding-agent-driven deployment.
  • File locking to prevent overlap may matter if multiple containers or tasks can write auth state.
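
The atomic-write point can be sketched with the common write-to-temp-then-rename pattern. This is a generic illustration, not Hermes Agent or OpenAB code; the function name is hypothetical, and it covers only a local file (for example on a volume shared between containers), not coordination between multiple writers.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at path with data so readers never see a partial file.

    The data is written to a temporary file in the same directory, flushed
    to disk, and then renamed over the target. os.replace is atomic when
    source and destination are on the same filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure bytes hit disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If a task is interrupted mid-write, the target file still holds its previous complete contents, which is the property that matters before a sidecar uploads it. Preventing two writers from overlapping would need separate locking.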

Hermes Agent principles that may not fit directly:

  • Gateway daemon tick model is probably not central to this PR unless the OpenAB runtime being deployed uses scheduled polling internally.

Overall, the PR aligns with the “operator runbook” side of these systems: clear deployment ownership, durable state handling, isolated execution, and operational recovery. It should avoid overclaiming production readiness unless it also covers locking, concurrent task behavior, logs, retries, and restore procedures.

Implementation Options

Conservative option: merge as a standalone reference architecture doc after technical review. Keep the scope to ECS Fargate Spot, add caveats for pricing/security, and require reviewers to validate commands, IAM policy shape, and persistence assumptions.

Balanced option: merge the doc plus add a lightweight validation checklist or companion issue. The doc lands now, while follow-up work tracks tested Terraform/CDK examples, least-privilege IAM, and runtime-specific verification.

Ambitious option: turn the refarch into a maintained deployment package. Add Terraform or CDK, example ECS task definitions, IAM policies, secret templates, health checks, logging defaults, and automated docs validation where possible.

Comparison Table

| Option | Speed to ship | Complexity | Reliability | Maintainability | User impact | Fit for OpenAB right now |
| --- | --- | --- | --- | --- | --- | --- |
| Standalone reviewed doc | High | Low | Medium | Medium | Medium | Strong |
| Doc plus validation checklist/follow-ups | Medium | Medium | Medium-High | High | High | Strongest |
| Full maintained IaC package | Low | High | High if tested | Medium | Very high | Risky unless maintainers want ongoing ownership |

Recommendation

Advance the balanced option.

Merge the reference architecture as documentation if the technical details check out, but require it to include explicit assumptions and caveats around AWS pricing, security, Spot interruption behavior, and auth persistence. Open follow-up work for tested IaC, least-privilege IAM examples, and operational validation.

That gives Masami or Pahud a mergeable next step without turning a docs PR into a full infrastructure product prematurely.

@chaodu-agent force-pushed the docs/refarch-ecs-fargate-spot branch from 5592d12 to 8709e66 on May 7, 2026, 00:06

2 participants