feat(buy-x402): --set-default — agent self-adopts paid/<model> as its primary model#595

Closed. bussyjd wants to merge 1 commit into main from feat/buy-x402-set-default

@bussyjd bussyjd commented Jun 4, 2026

What

Adds --set-default to buy.py buy <name> --model <id>. After a persistent inference buy publishes the paid model in LiteLLM as paid/<remote-model>, the obol-agent adopts it as its own primary chat model, in-pod, by itself — with no host-side obol model prefer/obol model sync, no new RBAC, no controller change, no LiteLLM mutation, and no pod restart.

This closes the last manual step in "agent buys paid inference → agent uses it": previously a human had to run obol model prefer paid/<id> && obol model sync from the host.

Mechanism

  • On a successful inference buy (PurchaseRequest Ready), the agent runs the native hermes config set model.default paid/<id> — an atomic write to its own /data/.hermes/config.yaml (the PVC it already owns rw). Hermes resolves model.default per request (_resolve_gateway_model), and the atomic inode swap busts its mtime-keyed config cache, so the change takes effect on the next chat turn with no restart.
  • Existence guard: queries LiteLLM /v1/models and refuses if paid/<id> isn't selectable (never points the agent at an unpublished model).
  • Auto-refill safety: --set-default without --auto-refill warns loudly — once a paid model is the primary, every turn fails when the pre-signed pool empties.
  • Fallback: if the hermes binary path moves or the call fails, a PyYAML edit of config.yaml preserves sibling keys.
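
The existence guard and the auto-refill warning can be sketched as one pure decision function. This is a minimal illustration, not the code in buy.py: it assumes LiteLLM's /v1/models returns the OpenAI-style `{"data": [{"id": ...}]}` list, and the names `model_is_selectable` and `plan_set_default` are hypothetical.

```python
def model_is_selectable(models_payload: dict, model_id: str) -> bool:
    """Return True if model_id is listed in an OpenAI-style /v1/models payload."""
    entries = models_payload.get("data") or []
    return any(isinstance(e, dict) and e.get("id") == model_id for e in entries)


def plan_set_default(models_payload: dict, remote_model: str, auto_refill: bool) -> dict:
    """Decide whether to adopt paid/<remote_model> and which messages to print.

    Hypothetical helper: the guard runs first, so the pool-exhaustion warning
    is only emitted when the model is actually about to become the default.
    """
    target = f"paid/{remote_model}"
    if not model_is_selectable(models_payload, target):
        return {
            "adopt": False,
            "target": target,
            "messages": [f"refusing: {target} is not selectable in LiteLLM"],
        }
    messages = []
    if not auto_refill:
        messages.append(
            "WARNING: no --auto-refill; every chat turn fails once the "
            "pre-signed pool is empty"
        )
    return {"adopt": True, "target": target, "messages": messages}
```

Keeping the decision separate from the HTTP call and the config write makes the refuse-then-warn order easy to check in isolation.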

Why this design (validated by a 10-agent design + adversarial-validation workflow): it needs none of the permissions the agent lacks (kubectl auth can-i confirms the agent cannot patch configmaps/deployments or exec pods), and it reuses Hermes' own config writer + per-request model resolution instead of reordering LiteLLM's model_list.
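
Why an atomic replace is enough for a per-request reader to pick up the change can be shown with a small stand-alone sketch. Both `atomic_write` and `MtimeKeyedCache` are hypothetical stand-ins, not Hermes' code. The file is treated as opaque text so only the standard library is needed, and the cache key includes the inode (an assumption) so the demonstration does not depend on timestamp granularity.

```python
import os
import tempfile


def atomic_write(path: str, text: str) -> None:
    """Write text to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic on POSIX: readers see the old or the new file, never a mix.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


class MtimeKeyedCache:
    """Re-read the file only when its (inode, mtime, size) key changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._key = None
        self._text = None

    def read(self) -> str:
        st = os.stat(self.path)
        key = (st.st_ino, st.st_mtime_ns, st.st_size)
        if key != self._key:
            with open(self.path) as fh:
                self._text = fh.read()
            self._key = key
        return self._text
```

Because the replacement file is created while the old one still exists, it always has a different inode, so a reader that checks the file on every request sees the new content on its next call.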

Test results — live CLI smoke against a running obol-agent

Cluster eager-airedale; seller aeon7 = AEON-7/Qwen3.6-35B-A3B-heretic-NVFP4 on Ethereum mainnet (OBOL/Permit2); agent wallet 0xf441…F373.

  1. Buy + set-default (CLI, in-pod):
    buy.py buy aeon7 --endpoint https://inference.v1337.org/services/aeon7 \
      --model AEON-7/Qwen3.6-35B-A3B-heretic-NVFP4 \
      --count 8 --auto-refill --refill-threshold 3 --refill-count 8 --set-default
    
    → PurchaseRequest Ready, controller published paid/AEON-7/…, auto-refill armed → Agent default model set to 'paid/AEON-7/Qwen3.6-35B-A3B-heretic-NVFP4' (effective next chat turn; no restart).
  2. Config write confirmed: /data/.hermes/config.yaml now contains model.default: paid/AEON-7/Qwen3.6-35B-A3B-heretic-NVFP4 (siblings api_key/base_url/provider preserved), written via the native hermes config set path (no fallback, guard passed).
  3. Live-chat proof: a chat through the agent with no model specified (so it used its default) moved the x402-buyer pool for aeon7 from spent 0 → 1 and remaining 8 → 7. That is a real on-chain settlement via paid/AEON-7, which shows the new default took effect on the next turn with no restart.
  4. Rollback verified: hermes config set model.default qwen36-deep reverts cleanly.

Spend: 0.008 OBOL (8 auths) + 0.001 OBOL (1 chat).
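
The pool arithmetic behind these numbers can be sketched as follows. This is a back-of-the-envelope illustration under stated assumptions: one auth is consumed per chat turn (as observed above, where one chat moved remaining 8 → 7), the per-auth price is inferred from 0.008 OBOL for 8 auths, and a refill is assumed to trigger once remaining is at or below the threshold. The real trigger semantics may differ, and both function names are hypothetical.

```python
from decimal import Decimal

# Inferred from the smoke test: 0.008 OBOL bought 8 auths.
PRICE_PER_AUTH_OBOL = Decimal("0.001")


def turns_until(remaining: int, floor: int, per_turn: int = 1) -> int:
    """Chat turns needed before `remaining` drops to `floor` or below."""
    if per_turn <= 0:
        raise ValueError("per_turn must be positive")
    if remaining <= floor:
        return 0
    # Ceiling division without floats.
    return -(-(remaining - floor) // per_turn)


def spend_obol(auths: int) -> Decimal:
    """Total OBOL for a number of pre-signed auths at the inferred price."""
    return PRICE_PER_AUTH_OBOL * auths
```

With --count 8 and --refill-threshold 3, `turns_until(8, 3)` gives 5 turns before a refill would be due under that assumption, and `turns_until(8, 0)` gives 8 turns before the pool is empty if nothing refills it.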

Build / static

  • python3 -m py_compile internal/embed/skills/buy-x402/scripts/buy.py
  • go build ./cmd/obol ✓ (embed intact)
  • repo pre-commit Python checks ✓

Notes / follow-ups

  • The chat response's cosmetic model label still reflects the static API_SERVER_MODEL_NAME (e.g. qwen36-deep); it does not govern routing — the x402-buyer spend is the ground truth. Cosmetic-only.
  • Auto-refill needs a scheduled runner (buy.py process --all on a cron/heartbeat) so a paid primary model stays funded. The flag records intent and the controller honors it on reconcile, but nothing currently invokes process --all on a schedule for the default agent. Tracked as a follow-up.
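
A scheduled runner for that follow-up could be as small as the loop below. It is a sketch under assumptions, not part of this PR: the interval, the `python3 buy.py` invocation path, and the injectable `run`/`sleep` parameters are illustrative. Only the `process --all` subcommand comes from the note above.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence

# Assumed invocation; adjust the interpreter and script path to the real layout.
REFILL_CMD = ("python3", "buy.py", "process", "--all")


def run_subprocess(cmd: Sequence[str]) -> int:
    """Run the command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def heartbeat(
    run: Callable[[Sequence[str]], int] = run_subprocess,
    sleep: Callable[[float], None] = time.sleep,
    interval_s: float = 60.0,
    max_ticks: Optional[int] = None,
) -> int:
    """Invoke the refill command every interval_s seconds; return the failure count.

    max_ticks=None loops forever; a finite value makes the loop testable.
    """
    failures = 0
    tick = 0
    while max_ticks is None or tick < max_ticks:
        try:
            rc = run(REFILL_CMD)
        except OSError:
            rc = -1  # e.g. interpreter or script not found
        if rc != 0:
            failures += 1
        tick += 1
        if max_ticks is None or tick < max_ticks:
            sleep(interval_s)
    return failures
```

A failed tick is counted rather than raised, so one bad reconcile does not stop later refills.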

Scope

Two files, additive: internal/embed/skills/buy-x402/scripts/buy.py (+159) and internal/embed/skills/buy-x402/SKILL.md (+2). No changes to the controller, RBAC, LiteLLM config, or any Go code.

…l> as primary

After a persistent inference buy publishes paid/<remote-model> in LiteLLM, the
agent adopts it as its own primary chat model in-pod via native
'hermes config set model.default' (atomic write, per-request re-read, no restart,
no host CLI, no new RBAC). Includes a LiteLLM /v1/models existence guard, an
auto-refill safety warning, and a PyYAML fallback writer.

Validated by a design+adversarial workflow and a live CLI smoke against a running
obol-agent: buy --set-default flips config.yaml model.default to paid/AEON-7/... and
the next agent chat settled via the x402-buyer pool (spent 0->1) with no restart;
rollback verified.
@OisinKyne

I wonder, does this (or can this) do a throwaway one-shot paid test of inference before it sets it as its LiteLLM brain?

bussyjd commented Jun 5, 2026

Superseded by the consolidated integration PR #598, which merges this branch together with the other PRs ≥ #593 into integration/pr-593-plus for a single reviewable diff. Review hardening fixes ride in #599 (targets the integration branch). Closing in favour of the consolidated review — reopen if you'd prefer to land this one standalone.

@bussyjd bussyjd closed this Jun 5, 2026
OisinKyne added a commit that referenced this pull request Jun 5, 2026
…ning) (#600)

* Work on openapi spec for services

* Improve 402 html page

* Update storefronts

* feat(buy-x402): add --set-default so the agent self-adopts paid/<model> as primary

After a persistent inference buy publishes paid/<remote-model> in LiteLLM, the
agent adopts it as its own primary chat model in-pod via native
'hermes config set model.default' (atomic write, per-request re-read, no restart,
no host CLI, no new RBAC). Includes a LiteLLM /v1/models existence guard, an
auto-refill safety warning, and a PyYAML fallback writer.

Validated by a design+adversarial workflow and a live CLI smoke against a running
obol-agent: buy --set-default flips config.yaml model.default to paid/AEON-7/... and
the next agent chat settled via the x402-buyer pool (spent 0->1) with no restart;
rollback verified.

* Have agent stream responses to keep the tunnel alive

* security(x402): SRI-pin the Scalar bundle on the public /api page

The /api OpenAPI reference is served over the public tunnel and pulls the
@scalar/api-reference bundle from jsdelivr. The integrity hash was left empty
in phase 1, so the browser executed whatever the CDN returned, unverified.

Populate scalarBundleSRI with the sha384 of the pinned 1.34.0 bundle so a
tampered CDN response is blocked. Comment updated to stress the hash must be
re-derived in lockstep with every scalarBundleVersion bump.

* fix(buy-x402): run --set-default existence guard before auto-refill warning

The 'paid/<model> not selectable in LiteLLM' guard ran *after* the
no-auto-refill WARNING. A model that LiteLLM would refuse still printed a
scary 'every chat turn fails when the pool empties' warning describing a
primary-model failure mode that cannot occur when the default was never
switched. Reorder so we refuse first and only warn when we are actually
about to adopt the model.

* security(x402): sanitize ServiceOffer-sourced tokens in 402 copy-paste commands

spec.model.name and metadata.name flow from the ServiceOffer CR into
copy-pasteable 'obol buy inference ...' commands rendered on the public 402
page. A hostile or fat-fingered offer could smuggle shell metacharacters into
a command a reader might paste. Add sanitizeDisplayToken at the render
boundary: CR-sourced tokens must match the model-id/k8s-name charset
(^[A-Za-z0-9._:/-]+$) or collapse to the existing safe placeholder. Real ids
like qwen3.5:9b and anthropic/claude-3-5-sonnet-latest pass through unchanged.

* docs(claude): compress and correct CLAUDE.md; fold in #597 streaming/sell-agent facts

Terse rewrite of project CLAUDE.md (42725 -> 41797 bytes) corrected against the live codebase. Preserves all invariants, the 14 pitfalls, and flag warnings; adds #597's stream:true / statusRecorder.Flush guidance and agent-backed-offer (port 8642) facts so the compressed doc loses nothing rc12 ships.

---------

Co-authored-by: Oisín Kyne <oisin@obol.tech>
Co-authored-by: bussyjd <jd@obol.tech>
Co-authored-by: bussyjd <bussyjd@users.noreply.github.com>