feat(aws): surface quota pressure in doctor #155

Merged
steipete merged 3 commits into openclaw:main from jwmoss:feat/aws-capacity-readiness
May 25, 2026

Contributor

@jwmoss jwmoss commented May 25, 2026

Summary

  • extend existing AWS provider doctor readiness with non-mutating EC2 Spot and On-Demand vCPU Service Quotas checks
  • surface confirmed quota pressure as advisory capacity warnings with the quota code, applied limit, default vCPU requirement, and recommended class/type; unreadable Service Quotas are skipped as capacity=unknown
  • preserve the timeout=10s text field for direct provider doctor checks that already include a provider=... field
  • return brokered AWS readiness capacity checks from the Worker and document crabbox doctor --provider aws as the pre-warmup check for new or unfunded AWS accounts
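
As a rough illustration of the capacity check described above, the sketch below compares an applied vCPU limit with the default requirement and picks the largest option that still fits. The `options` table, the `evaluate` helper, the largest-fit rule, and the `ok` capacity value are all assumptions made for illustration; only the class names, instance types, vCPU counts, and the `quota_pressure` / `unknown` values come from this PR description.

```go
package main

import "fmt"

// instanceOption pairs a crabbox class with an EC2 instance type and its
// vCPU count. This table is a hypothetical stand-in for the real catalog;
// only the three entries seen in the doctor excerpt are listed.
type instanceOption struct {
	Class string
	Type  string
	VCPUs int
}

// Ordered largest first, so the first option that fits is the largest one.
var options = []instanceOption{
	{"beast", "c7a.48xlarge", 192},
	{"standard", "c7a.8xlarge", 32},
	{"standard", "c7a.4xlarge", 16},
}

// capacityResult holds the fields a capacity check would report.
type capacityResult struct {
	Capacity    string         // "quota_pressure", "unknown", or "ok" (the last is assumed)
	Recommended instanceOption // zero value when nothing is recommended
}

// evaluate compares the applied vCPU limit with the default requirement.
// readable=false models a Service Quotas value that could not be read.
func evaluate(limitVCPUs int, readable bool) capacityResult {
	if !readable {
		return capacityResult{Capacity: "unknown"}
	}
	def := options[0]
	if limitVCPUs >= def.VCPUs {
		return capacityResult{Capacity: "ok"}
	}
	res := capacityResult{Capacity: "quota_pressure"}
	for _, o := range options[1:] {
		if o.VCPUs <= limitVCPUs {
			res.Recommended = o
			break
		}
	}
	return res
}

func main() {
	for _, limit := range []int{32, 16} {
		r := evaluate(limit, true)
		fmt.Printf("limit_vcpus=%d capacity=%s recommended_class=%s recommended_type=%s\n",
			limit, r.Capacity, r.Recommended.Class, r.Recommended.Type)
	}
	fmt.Printf("capacity=%s\n", evaluate(0, false).Capacity)
}
```

With limits of 32 and 16 vCPUs, this sketch lands on the same recommended types that appear in the doctor output in this description; the actual selection logic in the PR may differ.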

Verification

  • go test ./...
  • npm run format:check --prefix worker
  • npm run check --prefix worker
  • npm run lint --prefix worker
  • npm test --prefix worker
  • npm run build --prefix worker
  • go build -trimpath -o bin/crabbox ./cmd/crabbox
  • bin/crabbox doctor --provider aws --json exited 0 and reported advisory Spot and On-Demand quota pressure with recommended class/type fallbacks

Redacted excerpt from the latest doctor run:

{
  "ok": true,
  "provider": "aws",
  "checks": [
    {
      "status": "ok",
      "check": "provider",
      "provider": "aws",
      "message": "provider=aws coordinator_secrets=ready",
      "details": {
        "coordinator_secrets": "ready",
        "provider": "aws"
      }
    },
    {
      "status": "warning",
      "check": "capacity",
      "provider": "aws",
      "message": "provider=aws capacity=quota_pressure default_class=beast default_needed_vcpus=192 default_type=c7a.48xlarge hint=lower_class_or_request_quota limit_vcpus=32 market=spot quota_code=L-34B43A08 recommended_class=standard recommended_type=c7a.8xlarge region=eu-west-1",
      "details": {
        "capacity": "quota_pressure",
        "default_class": "beast",
        "default_needed_vcpus": "192",
        "default_type": "c7a.48xlarge",
        "hint": "lower_class_or_request_quota",
        "limit_vcpus": "32",
        "market": "spot",
        "provider": "aws",
        "quota_code": "L-34B43A08",
        "recommended_class": "standard",
        "recommended_type": "c7a.8xlarge",
        "region": "eu-west-1"
      }
    },
    {
      "status": "warning",
      "check": "capacity",
      "provider": "aws",
      "message": "provider=aws capacity=quota_pressure default_class=beast default_needed_vcpus=192 default_type=c7a.48xlarge hint=lower_class_or_request_quota limit_vcpus=16 market=on-demand quota_code=L-1216C47A recommended_class=standard recommended_type=c7a.4xlarge region=eu-west-1",
      "details": {
        "capacity": "quota_pressure",
        "default_class": "beast",
        "default_needed_vcpus": "192",
        "default_type": "c7a.48xlarge",
        "hint": "lower_class_or_request_quota",
        "limit_vcpus": "16",
        "market": "on-demand",
        "provider": "aws",
        "quota_code": "L-1216C47A",
        "recommended_class": "standard",
        "recommended_type": "c7a.4xlarge",
        "region": "eu-west-1"
      }
    }
  ]
}

exit=0

  • git diff --check
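
For anyone scripting against the doctor excerpt above, here is a minimal Go sketch that decodes the JSON and pulls out the advisory capacity warnings. The struct shapes mirror only the fields visible in the excerpt, so the full output schema is an assumption; `parseReport`, `capacityWarnings`, and `parseMessage` are hypothetical helper names, and the embedded sample is trimmed to a subset of fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// check and report mirror only the fields visible in the excerpt;
// anything beyond that is not covered by this sketch.
type check struct {
	Status   string            `json:"status"`
	Check    string            `json:"check"`
	Provider string            `json:"provider"`
	Message  string            `json:"message"`
	Details  map[string]string `json:"details"`
}

type report struct {
	OK       bool    `json:"ok"`
	Provider string  `json:"provider"`
	Checks   []check `json:"checks"`
}

// sample is a trimmed copy of the excerpt with one capacity warning kept.
const sample = `{
  "ok": true,
  "provider": "aws",
  "checks": [
    {"status": "ok", "check": "provider", "provider": "aws",
     "message": "provider=aws coordinator_secrets=ready",
     "details": {"coordinator_secrets": "ready", "provider": "aws"}},
    {"status": "warning", "check": "capacity", "provider": "aws",
     "message": "provider=aws capacity=quota_pressure limit_vcpus=32 market=spot quota_code=L-34B43A08 recommended_type=c7a.8xlarge",
     "details": {"capacity": "quota_pressure", "limit_vcpus": "32", "market": "spot",
                 "provider": "aws", "quota_code": "L-34B43A08", "recommended_type": "c7a.8xlarge"}}
  ]
}`

// parseReport decodes doctor JSON into a report.
func parseReport(data string) (report, error) {
	var r report
	err := json.Unmarshal([]byte(data), &r)
	return r, err
}

// capacityWarnings returns the checks that report advisory capacity pressure.
func capacityWarnings(r report) []check {
	var out []check
	for _, c := range r.Checks {
		if c.Check == "capacity" && c.Status == "warning" {
			out = append(out, c)
		}
	}
	return out
}

// parseMessage splits the space-separated key=value text form into a map.
func parseMessage(msg string) map[string]string {
	fields := map[string]string{}
	for _, tok := range strings.Fields(msg) {
		if k, v, ok := strings.Cut(tok, "="); ok {
			fields[k] = v
		}
	}
	return fields
}

func main() {
	r, err := parseReport(sample)
	if err != nil {
		panic(err)
	}
	for _, c := range capacityWarnings(r) {
		fields := parseMessage(c.Message)
		fmt.Printf("%s quota %s allows %s vCPUs; consider %s\n",
			c.Details["market"], fields["quota_code"],
			c.Details["limit_vcpus"], c.Details["recommended_type"])
	}
}
```

Reading `details` is the simpler route for JSON consumers; `parseMessage` is only useful when the text form of a check is all that is available.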

Notes

  • Brokered AWS already preflights quota during provisioning; this PR surfaces the same signal earlier in doctor without creating provider resources.
  • warmup does not run doctor automatically.
  • No new user secret is required; AWS Service Quotas reads use the existing provider credentials/policy surface.


socket-security Bot commented May 25, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License
Added | golang/github.com/aws/aws-sdk-go-v2/service/servicequotas@v1.34.7 | 98 | 100 | 100 | 100 | 100



clawsweeper Bot commented May 25, 2026

Codex review: needs maintainer review before merge. Reviewed May 25, 2026, 6:52 AM ET / 10:52 UTC.

Summary
The PR adds AWS EC2 Spot and On-Demand vCPU Service Quotas capacity checks to direct and brokered crabbox doctor, updates Worker readiness responses, docs, tests, changelog, and the Go Service Quotas dependency.

Reproducibility: not applicable; this is a feature PR rather than a bug report. The PR body provides redacted live crabbox doctor --provider aws --json output, and the diff adds focused CLI and Worker tests for the new behavior.

Review metrics: 2 noteworthy metrics.

  • Diff surface: 20 files changed; 903 added, 45 removed. The change spans CLI, Worker, docs, tests, and dependency metadata, so maintainer review should consider both direct and brokered paths.
  • Dependency surface: 1 direct Go module added. The new AWS Service Quotas SDK module is the only new supply-chain surface introduced by the PR.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Next step before merge
No repair lane is needed; the PR already includes sufficient real behavior proof and this review found no discrete automation-fixable defect.

Security
Cleared: The diff adds non-mutating AWS Service Quotas reads and one AWS SDK service module; no secret handling, workflow, lifecycle-script, or permission broadening concern was found.

Review details

Best possible solution:

Land the PR after normal CI and maintainer review so AWS users can see quota pressure in doctor before warmup while keeping unreadable quotas advisory.

Do we have a high-confidence way to reproduce the issue?

Not applicable; this is a feature PR rather than a bug report. The PR body provides redacted live crabbox doctor --provider aws --json output, and the diff adds focused CLI and Worker tests for the new behavior.

Is this the best way to solve the issue?

Yes; surfacing the existing AWS quota preflight signal as advisory doctor readiness checks is a narrow maintainable path, and unreadable Service Quotas data is skipped instead of failing doctor.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against b391141b657c.

Label changes

Label changes:

  • add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

  • P2: This is a normal-priority AWS usability feature with bounded blast radius and existing proof/tests, not an urgent regression or release emergency.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes redacted after-fix live CLI output showing doctor --provider aws --json returning advisory quota warnings with exit=0.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes redacted after-fix live CLI output showing doctor --provider aws --json returning advisory quota warnings with exit=0.

Evidence reviewed

What I checked:

  • AGENTS.md reviewed: Read the full repository AGENTS.md and applied the provider-boundary, neutral positioning, testing, and security/config guidance during review. (AGENTS.md:1, b391141b657c)
  • Current main lacks the central doctor surface: A current-main search found no capacity=quota_pressure, CapacityDoctor, warning capacity, or doctor capacity readiness implementation, so the requested behavior is not already implemented on main. (b391141b657c)
  • Direct AWS doctor implementation: The PR adds CapacityDoctorChecks and Service Quotas GetServiceQuota calls that emit advisory capacity doctor checks, including skip behavior for unknown quota data. (internal/cli/aws.go:67, bb0c5fa97c7b)
  • Brokered doctor wiring: The PR passes full config context to coordinator provider readiness and records returned readiness checks without treating warnings or skips as doctor failures. (internal/cli/doctor.go:188, bb0c5fa97c7b)
  • Worker readiness implementation: The Worker readiness route now derives an AWS lease config from query parameters and returns Service Quotas capacity readiness checks when AWS is configured. (worker/src/fleet.ts:1363, bb0c5fa97c7b)
  • Regression coverage: The PR adds CLI and Worker tests covering brokered capacity warning propagation and Service Quotas readiness responses before a lease is requested. (worker/test/fleet.test.ts:5868, bb0c5fa97c7b)
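
The advisory behavior noted in the list above (warnings and skips do not fail doctor) can be sketched roughly as follows. Only the `ok` and `warning` status strings appear in the PR's excerpt; the `skip` and `fail` names, the `doctorCheck` type, and both helpers are hypothetical and used purely for illustration.

```go
package main

import "fmt"

// doctorCheck is a minimal stand-in for one readiness check result.
type doctorCheck struct {
	Name   string
	Status string
}

// blocking reports whether a status should fail the overall doctor run.
// "ok" and "warning" are seen in the excerpt; "skip" is an assumed name
// for skipped checks, and any other status is treated as a failure.
func blocking(status string) bool {
	switch status {
	case "ok", "warning", "skip":
		return false
	default:
		return true
	}
}

// overallOK is true when no check carries a blocking status.
func overallOK(checks []doctorCheck) bool {
	for _, c := range checks {
		if blocking(c.Status) {
			return false
		}
	}
	return true
}

func main() {
	advisory := []doctorCheck{
		{"provider", "ok"},
		{"capacity", "warning"}, // confirmed quota pressure
		{"capacity", "skip"},    // unreadable quota, capacity=unknown
	}
	fmt.Println("advisory only:", overallOK(advisory))

	broken := append([]doctorCheck{{"provider", "fail"}}, advisory...)
	fmt.Println("with failure:", overallOK(broken))
}
```

Treating unrecognized statuses as blocking is a conservative choice in this sketch, not something the PR states.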

Likely related people:

  • Peter Steinberger: Current-main blame and history tie provider doctor readiness, Worker Service Quotas helpers, AWS quota docs, and a recent Worker provider-capabilities refactor to this author. (role: feature owner and recent area contributor; confidence: high; commits: e1f2f9317a65, 141e294a2676; files: internal/cli/doctor.go, worker/src/aws.ts, worker/src/fleet.ts)
  • Vincent Koc: Git history for Service Quotas-related AWS work includes Mac host quota preflight and related quota changes adjacent to this doctor capacity surface. (role: adjacent AWS quota contributor; confidence: medium; commits: d5aa05526cbb, 8819f4c8f8a8, 241a2a1d2b41; files: worker/src/aws.ts, docs/features/capacity-fallback.md)
  • Ayaan Zaidi: Recent current-main AWS Worker maintenance touched worker/src/aws.ts, making this a plausible adjacent routing candidate for Worker-side AWS behavior. (role: recent AWS Worker contributor; confidence: medium; commits: 388c99fe94e7; files: worker/src/aws.ts)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added labels on May 25, 2026:
  • rating: 🦐 gold shrimp: Decent PR readiness signal, but merge confidence is limited.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
  • P2: Normal priority bug or improvement with limited blast radius.

clawsweeper Bot commented May 25, 2026

ClawSweeper PR egg

✨ Hatched: ✨ glimmer Moonlit Proofling

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: ✨ glimmer.
Trait: guards the happy path.
Image traits: location release reef; accessory lint brush; palette amber, ink, and glacier blue; mood celebratory; pose leaning over a miniature review desk; shell brushed metal shell; lighting soft studio lighting; background smooth stones and checkmarks.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

clawsweeper Bot added and removed labels on May 25, 2026:
  • added proof: sufficient: Contributor real behavior proof is sufficient.
  • added rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • added status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
  • removed rating: 🦐 gold shrimp and status: 📣 needs proof.
clawsweeper Bot added and removed labels on May 25, 2026:
  • added rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • removed rating: 🐚 platinum hermit.
steipete merged commit 9fe0743 into openclaw:main on May 25, 2026
7 checks passed

Labels

  • P2: Normal priority bug or improvement with limited blast radius.
  • proof: sufficient: Contributor real behavior proof is sufficient.
  • rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.

2 participants