Profiles and Agents

KPH edited this page Jun 24, 2026 · 1 revision

SandboxProfile

Profiles use a Kubernetes-style schema under apiVersion klankermaker.ai/v1alpha1. The goose profile below provisions a Goose agent sandbox with Bedrock, OTEL, hibernation, EFS, GitHub repo allowlisting, Slack notifications, and the eBPF gatekeeper:

apiVersion: klankermaker.ai/v1alpha1
kind: SandboxProfile
metadata:
  name: goose
  prefix: gebpfgk

spec:
  lifecycle:
    ttl: "4h"
    idleTimeout: "1h"
    teardownPolicy: stop

  runtime:
    substrate: ec2
    spot: false
    instanceType: t3.medium
    region: us-east-1
    rootVolumeSize: 15
    hibernation: true              # preserve RAM state on pause (on-demand only)
    mountEFS: true
    efsMountPoint: /shared
    additionalVolume:
      size: 20
      mountPoint: /data

  execution:
    shell: /bin/bash
    workingDir: /workspace
    useBedrock: true               # SigV4 auth via AWS Bedrock
    privileged: false
    env:
      GOOSE_PROVIDER: aws_bedrock
      GOOSE_MODEL: us.anthropic.claude-opus-4-6-v1
    configFiles:
      "/home/sandbox/.claude/settings.json": |
        {"trustedDirectories":["/home/sandbox","/workspace"]}
    initCommands:
      - "yum install -y git nodejs npm python3 jq tmux"
      - "npm install -g @anthropic-ai/claude-code"

  budget:
    compute: { maxSpendUSD: 0.50 }
    ai:      { maxSpendUSD: 1.00 }
    warningThreshold: 0.80

  network:
    enforcement: both
    egress:
      allowedDNSSuffixes:
        - .amazonaws.com
        - .anthropic.com
        - .github.com
        - .githubusercontent.com
        - .npmjs.org
        - .pypi.org

  sourceAccess:
    mode: allowlist
    github:
      allowedRepos: [my-org/api, my-org/infra]
      allowedRefs:  [main, "feature/*", "fix/*"]

  identity:
    roleSessionDuration: "1h"
    allowedRegions: [us-east-1]
    sessionPolicy: minimal

  cli:
    notifyEmailEnabled: true
    notifySlackEnabled: true
    notifySlackPerSandbox: true
    notifySlackInboundEnabled: true
    notifySlackTranscriptEnabled: true

  observability:
    claudeTelemetry:
      enabled: true
      logPrompts: true
      logToolDetails: true
    tlsCapture:
      enabled: true
      libraries: [openssl]

  email:
    signing: required
    verifyInbound: required
    encryption: required

Profiles support inheritance via extends: start from a base profile and override only what you need. See docs/profile-reference.md for the full schema.
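
As a rough sketch of inheritance, a child profile might look like the following. The placement of extends (shown here at the top level) and its value format (a base profile name) are assumptions, as is the name goose-long; docs/profile-reference.md is authoritative for the real syntax. All other fields are taken from the goose profile above.

```yaml
# Hypothetical child profile. Where `extends` lives and what it accepts are assumed.
apiVersion: klankermaker.ai/v1alpha1
kind: SandboxProfile
metadata:
  name: goose-long            # hypothetical name
extends: goose                # assumed: names the base profile to inherit from

spec:
  lifecycle:
    ttl: "8h"                 # overrides the base value of "4h"
  budget:
    ai: { maxSpendUSD: 5.00 } # overrides the base value of 1.00
```

Whether sibling fields that are not overridden (for example budget.compute) are merged in from the base or must be restated is not covered on this page, so check the schema reference before relying on either behavior.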


Non-Interactive Agent Execution

Run Claude (or any agent) non-interactively inside a sandbox. Prompts are dispatched via SSM SendCommand, agents run in persistent tmux sessions that survive disconnects, and output is stored on disk and in S3 for fast retrieval.

# Fire-and-forget - agent runs in tmux, returns immediately
km agent run sb-abc123 --prompt "fix the failing tests"

# Wait for completion - blocks until done, prints JSON result
km agent run sb-abc123 --prompt "What model are you?" --wait

# Interactive - attach to tmux, watch Claude work live (Ctrl-B d to detach)
km agent run sb-abc123 --prompt "refactor auth module" --interactive

# Attach to a running agent's tmux session
km agent attach sb-abc123

# Fetch results (S3 fast path, ~3s)
km agent results sb-abc123 | jq '.result'
km agent results sb-abc123 | jq '.total_cost_usd'

# List all runs with status
km agent list sb-abc123

# Use direct Anthropic API instead of Bedrock
km agent run sb-abc123 --prompt "..." --no-bedrock --wait

Profile defaults: set spec.cli.noBedrock: true to default to the direct Anthropic API instead of Bedrock. Use spec.execution.configFiles to pre-seed Claude settings (trusted directories, etc.).
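
A minimal fragment combining both defaults might look like this. The field paths come from the paragraph above and the configFiles entry is copied from the goose profile; the rest of the profile is omitted.

```yaml
# Partial profile: only the fields relevant to agent defaults are shown
spec:
  cli:
    noBedrock: true            # default agent runs to the direct Anthropic API
  execution:
    configFiles:
      # Pre-seed Claude settings so these directories are trusted on first run
      "/home/sandbox/.claude/settings.json": |
        {"trustedDirectories":["/home/sandbox","/workspace"]}
```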

Codex parity: the same --prompt / --wait / --interactive flags work for Codex via --codex, with notify hooks (Slack, email) and inbound dispatch wired identically.


Scheduling and Recurring Operations

km at (alias km schedule) is the cron-like layer for sandbox operations. Schedules are backed by EventBridge Scheduler and persisted in DynamoDB.

# One-shot: create a sandbox at 10pm tomorrow
km at '10pm tomorrow' create profiles/goose.yaml --alias nightly

# Recurring: kill nightly sandbox every weekday at 11pm
km at 'every weekday at 11pm' kill nightly

# Schedule an agent run that auto-resumes a paused sandbox first
km at '6am tomorrow' agent run nightly --prompt "pull main, run tests" --auto-start

# Top up a budget on a schedule
km at 'every monday at 9am' budget-add nightly --ai 5.00

# Manage schedules
km at list
km at cancel my-nightly-tests

Supported subcommands: create, destroy/kill, stop, pause, resume, extend, budget-add, agent run. The dispatch model matches the CLI: each schedule fires a Lambda that invokes the same handler km would.


AMI Lifecycle

Sandboxes can be baked into private AMIs for fast cold starts of tuned environments.

km shell --learn --ami <sandbox>     # bake on shell exit; AMI ID written into learned profile
km ami list                          # operator-baked AMIs with profile references and size
km ami bake <sandbox>                # snapshot a running sandbox into an AMI
km ami copy <ami-id> --region <dst>  # copy AMI to another region in the same account
km ami delete <ami-id>               # deregister and delete EBS snapshots atomically

AMIs are private to the application AWS account (no LaunchPermission set) and live in a single region until copied. km doctor surfaces an AMI as a WARN when it is older than doctor_stale_ami_days (default 30) and unreferenced by any profile or running sandbox.

spec.runtime.ami accepts both slugs (amazon-linux-2023, ubuntu-24.04, ubuntu-22.04) and raw AMI IDs (ami-xxxxxxxx). The compiler auto-rotates additionalVolume off /dev/sdf if a baked AMI already claims it.
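
For illustration, a runtime block that pins the base image by slug could look like this. Field names and values all come from this page; the raw AMI ID alternative is shown only as a comment, with the placeholder left unfilled.

```yaml
# Partial profile: selecting the base image for a sandbox
spec:
  runtime:
    substrate: ec2
    region: us-east-1
    ami: ubuntu-24.04          # slug form; a raw ID such as ami-xxxxxxxx is also accepted
    additionalVolume:          # moved off /dev/sdf by the compiler if a baked AMI already claims it
      size: 20
      mountPoint: /data
```

Because AMIs live in a single region until copied, a raw AMI ID used here presumably needs to exist in the region named by spec.runtime.region; km ami copy covers that case.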

