
Update buy/sell skill to be more than just inference #403

Merged
OisinKyne merged 10 commits into main from oisin/buy-sell on May 1, 2026

Contributor

@OisinKyne OisinKyne commented Apr 30, 2026

Summary

I did a first buy with Hermes. Opus 4.7 had to work hard to find addresses, to find bugs in remote-signer that it wasn't working around, and so on.

I took that feedback and fed it into this pass.

Why it matters:

The biggest change is to remote-signer. I should have done a minor version bump rather than a patch, but blame Claude :)

Scope

x402 Buy/Sell Improvements — Change Summary

What we shipped

  1. Cross-repo: canonical Ethereum signatures from remote-signer

Fixed at the source rather than papered over in callers.

  • remote-signer (src/api/handlers.rs): sig_to_bytes now emits v ∈ {0x1b, 0x1c} instead of alloy's y-parity
    {0x00, 0x01}. Affects /sign/.../message, /sign/.../typed-data, and /sign/.../hash. /sign/.../transaction
    is unaffected (RLP-encoded type-2 txs use y-parity in the body, never go through sig_to_bytes). https://github.com/ObolNetwork/remote-signer/pull/9
  • remote-signer (tests/integration.rs): added canonical-v assertions across message/typed-data/hash plus a
    full ecrecover round-trip test that signs a known prehash and recovers back to the keystore address.
    Catches both wrong-byte-suffix and wrong-parity-bit regressions. 15/15 pass.
  • helm-charts (charts/remote-signer/Chart.yaml): chart 0.3.1 → 0.3.2, appVersion v0.2.0 → v0.2.1. Linked PR: "Update remote-signer to v0.3.0" (helm-charts#277)
  • obol-stack (internal/agentruntime/charts.go): RemoteSignerChartVersion = "0.3.2" with comment explaining
    the canonical-v contract.
  • obol-stack (buy.py): dropped _normalize_signature_recovery workaround now that the signer is canonical
    at the source.
  • obol-stack (tests/test_buy_autorefill.py): regression test asserting the workaround stays gone —
    re-introducing it on a v0.2.1+ signer would double-add 27 and corrupt every payment.
  2. New pay command: single-shot HTTP purchases

The biggest functional gap from the friction report: buy is inference-shaped (requires --model, builds a
PurchaseRequest, publishes paid/ through the sidecar) and unusable for type:http services like
demo-hello.

  • buy.py pay [--type http|inference] [--method GET|POST] [--data BODY]: probes for 402 pricing,
    pre-signs one EIP-3009 (or Permit2) auth, attaches X-PAYMENT, sends the request, prints the response body
    and any X-PAYMENT-RESPONSE settlement metadata. Stateless. No CR, no sidecar. Max loss = price of one
    request.
  • Replaces ~80 lines of bespoke Python an agent previously had to generate from scratch.
  3. probe --type http for non-inference services

probe no longer hard-appends /v1/chat/completions and POSTs a chat body when the target is an HTTP
service.

  • buy.py probe --type http: sends a GET to the URL as-is, parses the 402, prints pricing.
  • probe output now distinguishes the human-readable token name (extra.name, "USD Coin") from the EIP-712
    signing domain (extra.eip712Domain, "USDC") so agents stop signing with the wrong domain.
  4. Skill rename: buy-inference → buy-x402

The skill now covers both flows (inference budget via buy, one-shot HTTP via pay). Mechanical rename
across CLAUDE.md, README, embed_skills_test.go, four cross-referencing SKILL.md files, openclaw
integration tests, three flow scripts.

  5. Machine-readable service catalog: /api/services.json

Already existed; agents now don't have to parse markdown to construct a payment.

  • internal/serviceoffercontroller/render.go: enriched ServiceJSON with caip2Network (e.g. eip155:84532),
    chainId, priceUnit (perRequest/perMTok/perHour), priceAtomicUnits (atomic units of asset), and a full asset
    block: {address, symbol, decimals, transferMethod, eip712Domain {name, version}}.
  • Backfills chain defaults when the seller didn't pass --token (USDC on Base/Base-Sepolia/Ethereum);
    explicit OBOL passes through unchanged with transferMethod: "permit2" and 18-decimal price math.
  • New tests TestBuildServiceCatalogJSON_AssetAndCAIP2Defaults and
    TestBuildServiceCatalogJSON_ExplicitOBOLToken lock in the schema agents depend on.
  • Schema follow-up: internal/schemas/service_catalog.go now owns the reusable catalog wire structs and embeds service-catalog.schema.json; renderer tests compile the schema and validate representative empty, HTTP, per-MTok, USDC-default, and explicit OBOL catalogs against it.
  6. Documentation gaps closed
  • buy-x402/SKILL.md: rewritten intro covering both pay and buy. New Gasless Payments section (no ETH
    needed; facilitator settles on-chain), new Facilitator section (server-side, agents never call it; lists
    the three networks eip155:1/8453/84532 covered by https://x402.gcp.obol.tech), new Pitfalls section
    (EIP-712 domain confusion, pay-vs-buy, prefer /api/services.json over markdown).
  • buy-x402/references/x402-buyer-api.md: explicit pitfall block on extra.name ≠ EIP-712 signing domain.
    Added facilitator chain-coverage table. Pointer to /api/services.json as the preferred machine-readable
    surface.
  • ethereum-networks/references/common-contracts.md: added Base Mainnet (8453) and Base Sepolia (84532)
    USDC sections with EIP-712 signing pitfall callout. OBOL kept on Ethereum mainnet only per current
    deployment.

Optional next steps

These are the items we deliberately scoped out — none of them block this PR set.

  1. Standalone x402_client.py (deferred from item 2.3) — Extract the EIP-712 typed-data construction,
    signature normalization, and X-PAYMENT envelope code from buy.py into a reusable module, so buy, pay, and
    any future client share one tested implementation. Easier to do now that pay exists and we know the
    boundary.
  2. HTTP method per service in the offer schema (item 1.6, you said you'd file an issue): Optional
    spec.upstream.method on ServiceOffer so HTTP services can advertise GET vs POST. Surfaces in the skill.md
    table and /api/services.json. Real fix for the "demo-hello rejects POST" friction.
  3. OBOL on Base / Base Sepolia — Once the OBOL contract is deployed there, add registry entries in
    internal/x402/tokens.go and the chain default in defaultUSDCForNetwork (which would generalize to a
    per-token resolver).
  4. EIP-7702 authorization signing — When/if we want this, it needs a separate remote-signer endpoint (or
    parameter) that returns y-parity (0/1) per EIP-7702 spec, not v=27/28. Don't reroute through sig_to_bytes.
    Worth a one-line note in the remote-signer README so a future maintainer doesn't accidentally extend the
    canonical path.
  5. Legacy EIP-155 transaction signing — sign_transaction only emits type-2 today. If we ever need to sign
    legacy txs for a non-1559 chain, that's a new endpoint with its own v = 35 + 2*chain_id + parity encoding.
  6. Resolve the EIP3009Name registry inconsistency: internal/x402/chains.go ChainBaseSepolia says
    EIP3009Name: "USD Coin" but the empirically-correct signing domain on Base Sepolia USDC is "USDC". buy.py
    and the catalog default both override to "USDC". Worth either fixing the registry value or dropping the
    field if buy.py/catalog are now the source of truth.
  7. flow-14-live-obol-base-sepolia.sh after deploy: Run the live commerce smoke once the chart bump rolls;
    that's the real proof that v=27/28 settles end-to-end through the facilitator.
  8. Schedule a follow-up agent: None of the above are time-bound, so I wouldn't propose a cron, but if you
    want a one-off agent to re-run the live smoke + check services.json schema once the
    helm-charts/remote-signer PRs merge, that's a reasonable /schedule candidate.
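
Items 4 and 5 above turn on three different encodings of the same recovery bit. A minimal sketch of how they relate, using hypothetical helper names that do not exist in remote-signer:

```python
def canonical_v(y_parity: int) -> int:
    """Recovery byte for message, typed-data, and hash signatures: 27 or 28."""
    if y_parity not in (0, 1):
        raise ValueError("y_parity must be 0 or 1")
    return 27 + y_parity


def eip7702_v(y_parity: int) -> int:
    """EIP-7702 authorizations carry the raw y-parity (0 or 1), not 27/28."""
    if y_parity not in (0, 1):
        raise ValueError("y_parity must be 0 or 1")
    return y_parity


def eip155_legacy_v(y_parity: int, chain_id: int) -> int:
    """Legacy EIP-155 transactions use v = 35 + 2 * chain_id + parity."""
    if y_parity not in (0, 1):
        raise ValueError("y_parity must be 0 or 1")
    if chain_id <= 0:
        raise ValueError("chain_id must be positive")
    return 35 + 2 * chain_id + y_parity
```

Keeping the three as separate functions mirrors the advice above: a new signing mode gets its own path instead of being routed through the canonical 27/28 one.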

@OisinKyne
Copy link
Copy Markdown
Contributor Author

x402 Direct HTTP Purchase — Friction Report

An agent attempted to buy a demo-hello service (type: http, price: 0.00001 USDC)
from a remote Obol Stack instance on base-sepolia. No ETH for gas was available;
payment was entirely gasless via EIP-3009. The purchase succeeded, but only after
working around several documentation gaps, tooling blind spots, and format
mismatches. Below is everything that went wrong, grouped by severity.


Part 1: Fixes & Small Cleanups

These are concrete, low-effort changes that would eliminate the most common
failure modes for a less capable agent.

1.1 — USDC addresses missing from common-contracts.md

File: ethereum-networks/references/common-contracts.md

The common-contracts reference only lists mainnet tokens. The USDC addresses for
base-sepolia (0x036CbD53842c5426634e7929541eC2318f3dCF7e) and base
(0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913) are found only inside
buy-inference/scripts/buy.py (the USDC_CONTRACTS dict) and
buy-inference/references/x402-buyer-api.md.

An agent asked to "check your USDC balance on base-sepolia" will load
ethereum-networks first and find no base-sepolia section at all. It then has
to go hunting through other skills.

Fix: Add a ## Base Sepolia (Chain ID: 84532) section to
common-contracts.md with at least the USDC address, and similarly for Base
mainnet. These are the two payment chains the stack actually uses.

1.2 — EIP-712 domain name confusion: "USDC" vs "USD Coin"

The 402 response from the verifier returns extra.name: "USD Coin" and
extra.version: "2". The probe command prints this as eip712: USD Coin / 2.

But the correct EIP-712 domain name for signing on Base Sepolia USDC is
"USDC" (not "USD Coin"). The buy.py script hardcodes this correctly as
USDC_DOMAIN_NAME = "USDC", but a naive agent reading the probe output or the
402 body will use "USD Coin" and produce an invalid signature.

The extra.name field in the 402 response is the human-readable token name
(from the contract's name() function), not the EIP-712 domain name. These
happen to differ for Circle's USDC on Base Sepolia.

Fix (choose one or more):

  • Rename the 402 response extra fields to be unambiguous:
    extra.eip712DomainName and extra.eip712DomainVersion instead of the
    generic extra.name / extra.version.
  • Add a comment in the probe output: eip712 domain: USDC / 2 (signing name)
    vs token name: USD Coin (display only) so the distinction is clear.
  • At minimum, add a Pitfalls section to the buy-inference SKILL.md:

    The 402 extra.name is the token's human-readable name, NOT the EIP-712
    domain name used for signing. For USDC on Base Sepolia, sign with domain
    name "USDC", version "2". Do not use "USD Coin".
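
To make the pitfall concrete, here is a rough sketch of the EIP-712 typed data for an EIP-3009 TransferWithAuthorization on Base Sepolia USDC. The function name and argument names are hypothetical and this is not the code in buy.py; the contract address, chain ID, and domain values are the ones quoted in this report.

```python
# Values quoted in this report for USDC on Base Sepolia.
USDC_BASE_SEPOLIA = "0x036CbD53842c5426634e7929541eC2318f3dCF7e"
BASE_SEPOLIA_CHAIN_ID = 84532


def build_transfer_authorization(payer, pay_to, value, valid_after,
                                 valid_before, nonce_hex):
    """Return EIP-712 typed data for TransferWithAuthorization (EIP-3009)."""
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            "TransferWithAuthorization": [
                {"name": "from", "type": "address"},
                {"name": "to", "type": "address"},
                {"name": "value", "type": "uint256"},
                {"name": "validAfter", "type": "uint256"},
                {"name": "validBefore", "type": "uint256"},
                {"name": "nonce", "type": "bytes32"},
            ],
        },
        "primaryType": "TransferWithAuthorization",
        "domain": {
            # Signing domain name, NOT the display name from extra.name.
            "name": "USDC",
            "version": "2",
            "chainId": BASE_SEPOLIA_CHAIN_ID,
            "verifyingContract": USDC_BASE_SEPOLIA,
        },
        "message": {
            "from": payer,
            "to": pay_to,
            # Amounts and timestamps are sent as decimal strings here;
            # whether a given signer expects strings or ints is an assumption.
            "value": str(value),
            "validAfter": str(valid_after),
            "validBefore": str(valid_before),
            "nonce": nonce_hex,
        },
    }
```

Swapping "USDC" for "USD Coin" in the domain changes the domain separator, so the resulting signature would recover to a different address and be rejected.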

1.3 — Signature recovery normalization (v=0/1 → v=27/28)

The remote-signer sometimes returns signatures with recovery value v=0 or v=1
instead of the Ethereum-standard v=27 or v=28. The buy.py script handles this
with _normalize_signature_recovery(), but this function is buried inside
buy.py with no mention in any skill documentation.

An agent constructing a payment manually will get a signature with v=0 or v=1,
use it as-is, and the facilitator will reject it silently (the 402 just
re-triggers).

Fix: Document this in the ethereum-local-wallet SKILL.md or in the
remote-signer API reference:

Signatures from the remote-signer may use v=0/1 recovery format. Convert to
Ethereum v=27/28 before use: if the last byte is 0x00 or 0x01, add 27.
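
The rule above can be sketched as follows. This is a hypothetical helper written for illustration, not the _normalize_signature_recovery function from buy.py, and it assumes a 65-byte r || s || v signature. It is only relevant for signer versions that still return v=0/1.

```python
def normalize_recovery_byte(signature: bytes) -> bytes:
    """Return the signature with its last byte in Ethereum's 27/28 form."""
    if len(signature) != 65:
        raise ValueError("expected a 65-byte r || s || v signature")
    v = signature[-1]
    if v in (0, 1):
        # y-parity form: shift to the canonical recovery byte.
        return signature[:-1] + bytes([v + 27])
    if v in (27, 28):
        # Already canonical: leave untouched so 27 is never added twice.
        return signature
    raise ValueError(f"unexpected recovery byte: {v}")
```

Checking for 27/28 explicitly keeps the helper idempotent, which avoids the double-add failure mode if it is ever run against a signer that already emits canonical values.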

1.4 — Probe always appends /v1/chat/completions (inference assumption)

buy.py probe always appends /v1/chat/completions to the endpoint URL and
sends a POST with a chat-completions-shaped JSON body. This works for inference
services but is wrong for type: http services like demo-hello, which may only
accept GET on their root path.

The probe happens to work because the x402 verifier gate triggers before the
upstream sees the request, so you still get the 402 with pricing info. But it's
misleading — the printed endpoint shows /v1/chat/completions appended, which
is not the actual service URL.

Fix: Either:

  • Add a --type http flag to probe that skips the chat-completions path and
    uses GET.
  • Or detect from the 402 response (no model-related fields) and print a note:
    (Note: this appears to be an HTTP service, not inference. Use the base URL directly.)

1.5 — buy command requires --model (unusable for HTTP services)

The buy command requires --model as a mandatory argument. For type: http
services like demo-hello, there is no model. The entire buy→PurchaseRequest→
LiteLLM→sidecar pipeline is inference-specific.

An agent told to "buy demo-hello" cannot use buy.py buy at all for HTTP
services. It has to construct the X-PAYMENT header manually from scratch — sign
the EIP-712 data, base64-encode the envelope, and attach it to the request.
This is a significant amount of code for any agent to produce correctly.

This is the single biggest gap. See Part 2 for the overhaul suggestion.

1.6 — HTTP method not documented per service

The skill.md service catalog doesn't mention which HTTP methods each service
accepts. demo-hello returns 405 on POST but works on GET. An agent will
typically default to POST (since the inference path uses POST) and fail.

Fix: Add a Method column to the service catalog table, e.g.:

| Service | Type | Method | Price | Endpoint |
|---------|------|--------|-------|----------|
| demo-hello | http | GET | 0.00001 USDC | ... |

1.7 — The term "facilitator" is used but never defined with a URI

The buy-inference SKILL.md, x402-buyer-api.md, and x402-pricing.md all mention
a "facilitator" that settles payments on-chain. But no skill document ever
provides the facilitator URI or explains that the agent doesn't need to know it
— the seller-side verifier calls the facilitator internally.

An agent reading "client pays via facilitator" will search for a facilitator
endpoint to call directly and waste significant time.

Fix: Add a one-liner to the buy-inference SKILL.md:

The facilitator is server-side. The agent never calls it directly. The agent
signs a USDC authorization, sends it as X-PAYMENT, and the seller's
x402-verifier handles settlement via the facilitator. No ETH or gas is needed
from the buyer.


Part 2: Significant Overhaul Ideas

These are larger changes that would make x402 HTTP purchases seamless for any
agent, including models much less capable than Claude Opus.

2.1 — Add a pay command to buy.py for one-shot HTTP service purchases

The current buy.py is entirely inference-oriented (probe→buy→PurchaseRequest→
LiteLLM sidecar). For type: http services, none of that pipeline applies.

Proposal: Add a new command:

python3 scripts/buy.py pay <url> [--method GET|POST] [--data '{"key":"val"}'] [--count 1]

This would:

  1. Send a bare request to <url> to get the 402 pricing
  2. Pre-sign one (or N) EIP-3009 auth(s)
  3. Attach the X-PAYMENT header and re-send the request
  4. Print the response body and the settlement tx hash from x-payment-response
  5. If --count > 1, return all pre-signed auths for reuse

This single command replaces ~80 lines of manual Python that an agent currently
has to generate from scratch. It would make the purchase a one-liner:

python3 scripts/buy.py pay https://example.com/services/demo-hello
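
Steps 2 and 3 above hinge on building the X-PAYMENT header. A minimal sketch, assuming the header value is a base64-encoded JSON envelope in the shape used by the x402 "exact" scheme. The field names and both helper functions are assumptions for illustration, not code taken from buy.py, and the signature is expected to come from the signer separately.

```python
import base64
import json


def encode_x_payment(network, signature_hex, authorization, x402_version=1):
    """Build the X-PAYMENT header value from a signed EIP-3009 authorization."""
    envelope = {
        "x402Version": x402_version,  # assumed envelope field
        "scheme": "exact",            # assumed scheme name
        "network": network,
        "payload": {
            "signature": signature_hex,
            "authorization": authorization,
        },
    }
    raw = json.dumps(envelope, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_x_payment(header_value):
    """Inverse of encode_x_payment, handy when debugging a rejected payment."""
    return json.loads(base64.b64decode(header_value))
```

A caller would attach the result as the X-PAYMENT request header on the retry after the initial 402.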

2.2 — Unify the skill.md catalog with machine-readable service metadata

Currently the tunnel's /skill.md is a nice human-readable table, but an agent
parsing it has to extract URLs, prices, and pay-to addresses from markdown. The
/.well-known/agent-registration.json exists but doesn't include pricing or
payment details — it's identity-only.

Proposal: Extend the skill.md generator to emit a machine-readable block
(JSON or YAML fenced block) at the bottom with the full service spec per
service:

{
  "services": [
    {
      "name": "demo-hello",
      "type": "http",
      "method": "GET",
      "endpoint": "https://...",
      "price": "10",
      "priceUnit": "micro-USDC",
      "payTo": "0x...",
      "network": "eip155:84532",
      "asset": "0x036...",
      "eip712Domain": {"name": "USDC", "version": "2"},
      "transferMethod": "eip3009"
    }
  ]
}

This eliminates all ambiguity about domain names, transfer methods, and
addressing for any consuming agent.
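
The "price": "10" value in the example is the 0.00001 USDC price expressed in the token's smallest unit (USDC has 6 decimals). A small sketch of that conversion, with a hypothetical helper name, which also covers 18-decimal tokens such as OBOL:

```python
from decimal import Decimal


def to_atomic_units(price: str, decimals: int) -> str:
    """Convert a human-readable token price to an integer string of atomic units."""
    if decimals < 0:
        raise ValueError("decimals must be non-negative")
    scaled = Decimal(price) * (Decimal(10) ** decimals)
    if scaled != scaled.to_integral_value():
        # The price has more fractional digits than the token can represent.
        raise ValueError("price is not representable at this precision")
    return str(int(scaled))
```

Using Decimal rather than float avoids rounding surprises such as 0.00001 * 10**6 not being exactly 10 in binary floating point.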

2.3 — A standalone "x402 client" script separate from the inference pipeline

Right now, the entire x402 payment construction knowledge lives inside buy.py,
entangled with PurchaseRequest CRDs, LiteLLM sidecar config, and Kubernetes
API calls. There's no reusable library or standalone script for "sign an EIP-3009
auth and make a paid HTTP request."

Proposal: Factor out an x402_client.py (or add it to ethereum-local-wallet)
that exposes:

# For scripts:
from x402_client import make_paid_request
resp = make_paid_request("https://..../demo-hello", method="GET")

# For CLI:
python3 scripts/x402_client.py https://..../demo-hello

This would be the low-level building block that both buy.py buy (inference)
and buy.py pay (HTTP) use internally. It keeps the EIP-712 domain logic,
signature normalization, and X-PAYMENT envelope construction in one tested
place instead of requiring every agent to re-derive it.

2.4 — Explicit "gasless payment" section in buy-inference SKILL.md

The fact that x402 EIP-3009 payments are gasless is the key insight that makes
the whole flow work without ETH. But this is never stated explicitly in any
skill. The closest is a passing mention of "facilitator settles" in the wire
format docs.

An agent with 0 ETH that's told "buy this service" will likely conclude it
can't transact and tell the user it needs gas money.

Proposal: Add a prominent section near the top of buy-inference SKILL.md:

Gasless Payments

x402 payments do NOT require ETH for gas. The agent signs a USDC
TransferWithAuthorization (EIP-3009) off-chain. The seller's facilitator
submits the on-chain settlement and pays gas. The agent only needs a USDC
balance — zero ETH is fine.


Summary of agent failure modes (in order of likelihood)

| # | Failure | Root cause | Where to fix |
|---|---------|------------|--------------|
| 1 | Can't find USDC address for base-sepolia | Only in buy.py source code | common-contracts.md |
| 2 | Signs with wrong EIP-712 domain name | 402 extra.name is misleading | Probe output + docs |
| 3 | Signature rejected (v=0/1 vs v=27/28) | Undocumented remote-signer quirk | wallet skill docs |
| 4 | Uses POST instead of GET for HTTP service | No method info in catalog | skill.md generator |
| 5 | Can't use buy command for HTTP services | --model required, pipeline is inference-only | New pay command |
| 6 | Thinks it needs ETH for gas | Nowhere says payments are gasless | SKILL.md section |
| 7 | Searches for facilitator URI to call | "facilitator" mentioned but never explained | SKILL.md clarification |
| 8 | Appends /v1/chat/completions to HTTP service URL | Probe hardcodes inference path | Probe --type flag |

@OisinKyne OisinKyne requested a review from bussyjd April 30, 2026 23:16
Collaborator

bussyjd commented May 1, 2026

Schema enforcement follow-up

I added a follow-up commit on this PR to make the /api/services.json contract explicit instead of leaving it as private renderer structs.

Before

flowchart LR
  A["ServiceOffer CRD OpenAPI"] --> B["controller render.go"]
  C["internal/schemas input structs"] --> B
  B --> D["/api/services.json"]
  D -.-> E["buyers / agents"]
  • internal/schemas covered ServiceOffer/payment/registration authoring types.
  • The CRD OpenAPI schema enforced input/admission constraints like address shape, transfer method, asset decimals, path format, and required payment/upstream fields.
  • The public catalog wire shape lived locally in internal/serviceoffercontroller/render.go as ServiceJSON / ServiceAssetJSON / EIP712DomainJSON.
  • Tests validated important fields, including OBOL 18-decimal atomic math, but there was no reusable catalog schema artifact.

Fix

flowchart LR
  A["ServiceOffer CRD OpenAPI"] --> B["controller"]
  C["internal/schemas ServiceCatalogEntry"] --> B
  C --> D["embedded service-catalog.schema.json"]
  B --> E["/api/services.json"]
  E --> F["schema validation in renderer tests"]
  E --> G["buyers / agents"]
  • Added internal/schemas/service_catalog.go as the canonical Go wire model for the seller catalog.
  • Added embedded internal/schemas/service-catalog.schema.json for /api/services.json.
  • Moved the renderer to emit schemas.ServiceCatalogEntry / ServiceCatalogAsset / ServiceCatalogEIP712Domain instead of local private structs.
  • Added schema validation in renderer tests using github.com/santhosh-tekuri/jsonschema/v6.
  • The schema enforces exact public fields, rejects additional properties, validates EVM addresses, decimal strings, atomic-unit strings, CAIP-2 shape, chain IDs, asset decimals, and transfer method enum.
  • Representative catalog tests now validate empty, HTTP, per-MTok, Base/Base-Sepolia USDC default, and explicit OBOL Permit2 outputs against the schema.
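
To give a feel for what these checks cover, here is a rough Python approximation of a few of them. It is not the embedded service-catalog.schema.json or the Go renderer tests; the function name and the exact regular expressions are assumptions, and only field names mentioned in this PR are used.

```python
import re

EVM_ADDRESS = re.compile(r"0x[0-9a-fA-F]{40}")
CAIP2_EIP155 = re.compile(r"eip155:[0-9]+")
ATOMIC_UNITS = re.compile(r"0|[1-9][0-9]*")
PRICE_UNITS = {"perRequest", "perMTok", "perHour"}
TRANSFER_METHODS = {"eip3009", "permit2"}


def catalog_entry_errors(entry: dict) -> list:
    """Return the names of fields that fail the approximate checks."""
    errors = []
    asset = entry.get("asset") or {}
    domain = asset.get("eip712Domain") or {}
    if not CAIP2_EIP155.fullmatch(str(entry.get("caip2Network", ""))):
        errors.append("caip2Network")
    chain_id = entry.get("chainId")
    if not isinstance(chain_id, int) or chain_id <= 0:
        errors.append("chainId")
    if entry.get("priceUnit") not in PRICE_UNITS:
        errors.append("priceUnit")
    if not ATOMIC_UNITS.fullmatch(str(entry.get("priceAtomicUnits", ""))):
        errors.append("priceAtomicUnits")
    if not EVM_ADDRESS.fullmatch(str(asset.get("address", ""))):
        errors.append("asset.address")
    decimals = asset.get("decimals")
    if not isinstance(decimals, int) or not 0 <= decimals <= 255:
        errors.append("asset.decimals")
    if asset.get("transferMethod") not in TRANSFER_METHODS:
        errors.append("asset.transferMethod")
    if not domain.get("name") or not domain.get("version"):
        errors.append("asset.eip712Domain")
    return errors
```

An agent could run a check like this before signing, so that a malformed catalog entry fails early rather than producing a rejected payment.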

Remaining boundary

This does not turn services.json into a CRD admission schema. CRD/input validation still lives in the Kubernetes OpenAPI CRD and Go input structs. The new check is a CI-level public contract guard for what sellers publish to agents.

Validated locally:

go test ./internal/schemas ./internal/serviceoffercontroller -count=1
git diff --check

Collaborator

bussyjd commented May 1, 2026

QA report: fresh isolated worktrees

What changed:

  • Re-ran the PR head from fresh, PR-specific QA worktrees on two QA machines with sudo access.
  • Used commit 38d50fffee310a5fa81a97a34f6c67b03cf53abf.
  • Kept cleanup scoped to recorded stack/worktree state; no broad host cleanup was used.

Why it matters:

  • This removes stale checkout ambiguity from the result.
  • The USDC seller/buyer path (flow-11-dual-stack) now has a clean end-to-end receipt set.
  • The full fleet is still not green; several failures remain and are listed below.

Risk level: medium

Commit under test: 38d50fffee310a5fa81a97a34f6c67b03cf53abf

Base branch: PR #403 head branch oisin/buy-sell

Scope

  • Code
  • Charts / manifests
  • Flows / QA scripts
  • Docs / skills
  • Images / dependencies
  • Other: live/fork flow selection and paid seller/buyer smoke

Validation

CI checks:

| Check | Status | Link |
|-------|--------|------|
| GitHub CI | Not re-read in this QA pass | This comment covers manual remote QA only |

Unit tests:

QA-A: bash -n flows/*.sh -> PASS
QA-A: go test ./... -count=1 -> PASS
QA-B: helm lint internal/embed/infrastructure/cloudflared -> PASS
QA-B: helm template cloudflared internal/embed/infrastructure/cloudflared | grep cloudflare/cloudflared -> PASS, cloudflare/cloudflared:2026.3.0
QA-B: go test ./cmd/obol ./internal/tunnel ./internal/stack -count=1 -> PASS

Integration tests:

QA-A: go test -tags integration -v -run TestBDDIntegration -timeout 25m ./internal/x402/ -> FAIL
  - Bootstrap reached stack startup and teardown.
  - Failure: serviceoffers.obol.org "bdd-test" not found.
  - Teardown also hit sudo-owned data cleanup: sudo password required while resolving seller wallet.

QA-A: go test -tags integration -v -timeout 60m ./internal/openclaw/ -> PASS

Flow tests:

| Flow | Network | QA machine label | Worktree | Result | Artifacts |
|------|---------|------------------|----------|--------|-----------|
| flow-01-prerequisites | local/Base Sepolia preflight | QA-B | fresh isolated | PASS | release-smoke log retained |
| flow-02-stack-init-up | local k3d + Base Sepolia eRPC | QA-B | fresh isolated | PASS | release-smoke log retained |
| flow-03-inference | local LiteLLM/Ollama | QA-B | fresh isolated | PASS | release-smoke log retained |
| flow-04-agent | local Hermes | QA-B | fresh isolated | FAIL | agent inference, agent content, Hermes gateway health, dashboard deeplink failed |
| flow-05-network | local eRPC/Base Sepolia config | QA-B | fresh isolated | PASS | release-smoke log retained |
| flow-06-sell-setup | Base Sepolia x402 config | QA-B | fresh isolated | PASS | release-smoke log retained |
| flow-07-sell-verify | Base Sepolia x402 config | QA-B | fresh isolated | PASS | release-smoke log retained |
| flow-10-anvil-facilitator | Base Sepolia fork | QA-B | fresh isolated | FAIL | missing x402-facilitator binary |
| flow-08-buy | Base Sepolia USDC public facilitator | QA-B | fresh isolated | FAIL | payment rejected, HTTP 503 settlement failed |
| flow-09-lifecycle | local lifecycle | QA-B | fresh isolated | PASS | release-smoke log retained |
| flow-11-dual-stack | Base Sepolia USDC public facilitator | QA-B | fresh isolated | PASS | registration, metadata, settlement receipt summary retained |
| flow-14-live-obol-base-sepolia | Base Sepolia OBOL public facilitator | QA-B | fresh isolated | FAIL | Alice stack up failed before live OBOL purchase |
| flow-13-dual-stack-obol | Anvil/fork optional | QA-B | fresh isolated | PASS/SKIPPED | preflight skipped because no X402_FACILITATOR_BIN or X402_RS_DIR |

Release smoke:

RELEASE_SMOKE_INCLUDE_OBOL=true RELEASE_SMOKE_INCLUDE_OBOL_FORK=true bash flows/release-smoke.sh
Result: FAIL, 4 failing flows
Failing flows: flow-04-agent, flow-10-anvil-facilitator, flow-08-buy, flow-14-live-obol-base-sepolia
Passing critical USDC dual-stack flow: flow-11-dual-stack

Live Chain Evidence

Network: Base Sepolia (84532)

RPC/provider: https://sepolia.base.org

Facilitator: public x402 facilitator, base-sepolia exact support confirmed in preflight

Contracts and tokens:

| Name | Address | Version / notes |
|------|---------|-----------------|
| OBOL | 0x54AE82bc871a4E3E8E2FE1173Cb864B8563D44D4 | name Obol Network, symbol OBOL, decimals 18; flow-14 failed before purchase |
| USDC | Base Sepolia USDC | flow-11 used micro-USDC settlement successfully |

Wallet roles:

| Role | Address | Source |
|------|---------|--------|
| Alice / seller / register | 0xC0De030F6C37f490594F93fB99e2756703c4297E | .env signer key |
| Bob / buyer / payer | 0x57b0eF875DeB5A37301F1640E469a2129Da9490E | deterministic derived buyer wallet |

Balances:

| Token | Address | Before | After | Expected delta | Actual delta |
|-------|---------|--------|-------|----------------|--------------|
| USDC | Alice | 8,922,000 micro-USDC | 8,923,000 micro-USDC | +1,000 | +1,000 |
| USDC | Bob signer | 4,996,000 micro-USDC | 4,995,000 micro-USDC | -1,000 | -1,000 |
| OBOL | Bob | 999999006000000000000000 atomic units | unchanged in this run | live OBOL purchase not reached | not applicable |

Transaction receipts:

| Purpose | Tx hash | From | To | Amount / event | Status |
|---------|---------|------|----|----------------|--------|
| ERC-8004 registration | 0x14d2f489160c2199a07ff1898332a5aadea9ade22378cd594d11ab779df29004 | Alice | ERC-8004 registry | agent id 5294 | archived |
| Metadata / service offer | 0x1d1826d680c86f4f21d04bb6512bd7b90db968ee7199a70dae8bf5d528fbc7fc | Alice | metadata path | service offer metadata | archived |
| Purchase request | Kubernetes PurchaseRequest | Bob agent | Alice endpoint | 5 auth tokens at 1000 micro-USDC each | Ready=True |
| Settlement transfer | 0x6a921042660b7419ce645d6646188d2a4cc557e488f2b66a5a687dc480e7b48c | Bob signer | Alice | 1000 micro-USDC transfer | archived and amount verified |

Runtime Evidence

QA environment:

| Item | Value |
|------|-------|
| OS / arch | Ubuntu 24.04.4 LTS, arm64 |
| Backend | k3d/k3s |
| Tool versions | Docker 29.2.1, k3d 5.8.3, k3s 1.31.5, kubectl 1.35.3, Helm 3.20.1, Go 1.25.1/1.25.3 |
| QA agent/model | Codex, GPT-5 |

Images:

| Component | Image | Tag / digest | Source |
|-----------|-------|--------------|--------|
| cloudflared | cloudflare/cloudflared | 2026.3.0 | rendered Helm template |
| remote-signer | chart pin | 0.3.2 | flow prerequisites check |
| OpenClaw / Hermes local images | local cached/imported images | current PR workspace build/cache | release smoke logs |

Kubernetes / stack:

| Item | Value |
|------|-------|
| Stack IDs | omitted from PR comment; retained in QA logs only |
| Namespaces | stack, x402, llm, erpc, hermes namespaces exercised |
| Pod readiness | flow-11 Alice/Bob x402 pods ready; flow-02 all pods healthy after retry |
| Cleanup result | flow-11 stopped Alice and Bob stacks; additional scoped cleanup still recommended for the failed flow-14 cluster state |

Model and routing:

| Item | Value |
|------|-------|
| Agent/model used | qwen3.5:9b, paid alias paid/qwen3.5:9b |
| LiteLLM route | buyer sidecar auth found for alice-inference, remaining 5, spent 0 before paid inference |
| Paid endpoint status | flow-11 paid inference returned HTTP 200 in 6.4s |
| Auth token source | obol agent auth --runtime <runtime> obol-agent path exercised by flow |

Artifacts and logs:

| Artifact | Location / link | Notes |
|----------|-----------------|-------|
| Release smoke report | QA-B fresh worktree artifact directory | contains per-flow pass/fail table |
| flow-11 receipt summary | QA-B fresh worktree artifact directory | contains registration, metadata, settlement tx hashes |
| QA-A integration log | QA-A fresh worktree log directory | contains x402 BDD failure and openclaw integration result |

Demo readiness:

| Item | Status | Notes |
|------|--------|-------|
| Seller visible / registered | PASS for USDC flow-11 | ERC-8004 registration reflected in ServiceOffer |
| Buyer discovery works | PASS for USDC flow-11 | agent discovery prompt led to PurchaseRequest |
| Paid route works | PASS for USDC flow-11; FAIL for earlier single-stack flow-08; OBOL not reached | dual-stack path is useful, single-stack buy still failing |
| Settlement visible on-chain | PASS for USDC flow-11 | settlement tx archived and balance deltas verified |
| Live OBOL seller/buyer | FAIL before purchase | flow-14 failed at Alice stack up due to a k3d existing-cluster/network mismatch after previous flow cleanup |

Review Notes

Known gaps:

  • The full fleet is not green.
  • flow-14-live-obol-base-sepolia did validate OBOL token metadata and Bob OBOL balance, but failed before purchase because stack up attempted to start an existing k3d cluster whose Docker network was missing.
  • flow-11-dual-stack stopping stacks instead of deleting them appears insufficient before immediately running flow-14 in the same release-smoke sequence.
  • flow-13-dual-stack-obol is currently reported as PASS when it effectively skipped due to a missing local facilitator binary; this should be reported as SKIPPED/optional, not as a passing fork regression.
  • flow-10-anvil-facilitator needs an explicit facilitator build/path or should be excluded from default release smoke if the fork path is optional.
  • flow-04-agent still has Hermes/API gateway failures unrelated to the service schema rename.
  • flow-08-buy still fails settlement through the single-stack path with HTTP 503.
  • QA-A x402 BDD failed around missing bdd-test ServiceOffer and teardown requiring sudo for owned runtime data.

Follow-ups:

  • Add a release-smoke cleanup boundary between dual-stack flows, or make flow-11 delete its k3d clusters when the next flow needs a clean cluster with the same generated stack id.
  • Change optional fork flow reporting from PASS-on-skip to explicit SKIPPED.
  • Decide whether default release smoke should run only live Base Sepolia OBOL plus the known-green USDC dual-stack path, with Anvil/fork checks as named optional jobs.
  • Investigate flow-04-agent, flow-08-buy, and the x402 BDD fixture/teardown separately from this schema/unit rename PR.

Reviewer focus:

  • The schema/atomic-units changes are still useful and locally covered.
  • The PR should not be considered release-green until the full fleet failures above are either fixed or explicitly scoped out.

Collaborator

bussyjd commented May 1, 2026

Additional QA: explicit obol CLI lifecycle pass

I added a separate QA pass that drives the lifecycle through the obol CLI directly instead of relying only on release-smoke flow names.

Commit under test: 38d50fffee310a5fa81a97a34f6c67b03cf53abf

QA setup:

  • Fresh detached worktree on a QA machine with sudo access.
  • Built obol from the PR head inside that worktree.
  • No hostnames, secrets, or private paths included here; raw logs remain in the QA artifact directory.

Command coverage:

| Area | Commands exercised | Result |
|------|--------------------|--------|
| Build/version | go build ./cmd/obol, obol version | PASS |
| Stack lifecycle | obol stack init, obol stack up, obol kubectl get nodes, obol stack down | PASS |
| Network lifecycle | obol network list, obol network status, obol network add base-sepolia --count 1, obol network status | PASS |
| Network remove | obol network remove base-sepolia immediately after add | FAIL: eRPC restart rate-limit; CLI says to wait because restart was already triggered within the past second |
| Negative validation | obol network add base-sepolia --endpoint not-a-valid-url | Correctly rejected invalid URL; harness counted non-zero as FAIL but CLI behavior is expected |
| Agent lifecycle | obol agent init, obol agent list, obol agent wallet list obol-agent, obol agent auth obol-agent | PASS |
| Seller pricing | obol sell pricing --wallet ... --chain base-sepolia, obol sell status | PASS |
| ServiceOffer lifecycle | obol sell http ... --no-register, obol sell list, obol sell status, obol sell test, obol sell update, obol sell stop, obol sell delete | PASS for create/list/status/test/update/stop/delete |
| Updated-price visibility | obol sell status <name> after obol sell update --per-request 0.002 | CLI returned Ready conditions but did not show the new price in the status output; verification failed looking for price visibility |
| Tunnel lifecycle | obol tunnel status | PASS |
| Tunnel logs | obol tunnel logs --tail 20 | FAIL: --tail is not a supported flag; QA harness command was wrong or CLI lacks expected tail support |
| Delete verification | obol kubectl get serviceoffer <name> after delete | Correctly returned NotFound; harness counted non-zero as FAIL but deletion behavior is expected |
| Purge cleanup | obol stack purge --force after down | FAIL/timeout at sudo-owned data removal; cluster containers/config were removed, but data cleanup needed sudo credentials |

Summary from the raw lifecycle harness:

31 commands/checks total
25 harness PASS
6 harness FAIL

Interpretation:

  • Real CLI lifecycle coverage is much stronger now: stack, network add/status, agent setup/auth/wallet, sell pricing, ServiceOffer create/test/update/stop/delete, tunnel status, and stack down all ran through obol commands.
  • Two recorded harness failures are expected non-zero outcomes and should be fixed in the QA harness rather than treated as product bugs: invalid endpoint rejection and post-delete NotFound verification.
  • Three items are useful follow-ups:
    • Add a wait/retry around obol network remove after network add, or make the CLI tolerate immediate remove after add without surfacing the eRPC restart rate-limit.
    • Make obol sell status <name> display enough pricing detail after sell update for CLI-only QA to verify the effective price without falling back to raw CRD inspection.
    • Decide whether obol tunnel logs should support --tail, or update QA docs/scripts to call the supported syntax.
  • Cleanup remains a QA-host concern: stack purge --force can time out when runtime data is sudo-owned and sudo credentials are not cached. The cluster itself was deleted, and no PR-specific k3d cluster was left behind.
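
The harness fix for the two expected non-zero outcomes could look roughly like the sketch below. It is written in Python for brevity and is not the actual QA harness. The `Check` type, the `expect_failure` field, and the stand-in commands are all hypothetical; a real harness would invoke the obol CLI instead of `sys.executable`.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Check:
    name: str
    argv: list[str]
    # Hypothetical flag: True for negative tests where a non-zero exit is the
    # correct behavior (invalid endpoint rejection, NotFound after delete).
    expect_failure: bool = False


def run_check(check: Check) -> bool:
    """Return True when the exit status matches what the check expects."""
    proc = subprocess.run(check.argv, capture_output=True, text=True)
    failed = proc.returncode != 0
    return failed == check.expect_failure


# Stand-in commands (assumption): they only emulate exit codes.
checks = [
    Check("version", [sys.executable, "-c", "raise SystemExit(0)"]),
    Check(
        "invalid endpoint rejected",
        [sys.executable, "-c", "raise SystemExit(1)"],
        expect_failure=True,
    ),
]
results = {check.name: run_check(check) for check in checks}
```

With this shape, a negative test that unexpectedly succeeds is reported as a failure, so a rejection check cannot silently turn into a no-op.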

This does not change the earlier release-smoke conclusion: flow-11-dual-stack is useful and passed with receipts, but the full QA fleet is still not release-green.

@bussyjd
Collaborator

bussyjd commented May 1, 2026

Full flows/ QA pass

This is the canonical flow-script QA pass, not the extra ad hoc CLI lifecycle harness.

Commit under test: 38d50fffee310a5fa81a97a34f6c67b03cf53abf

Setup:

  • Fresh isolated worktrees on two QA machines with sudo access.
  • Baseline/stateful flows ran in order on one fresh worktree.
  • Independent dual-stack flows ran in separate fresh worktrees so flow-11, flow-14, and flow-13 could not poison each other through reused k3d state.
  • No PR-specific k3d cluster remains after the run. Existing unrelated stopped clusters were left untouched.

Important facilitator note:

  • Live Base Sepolia OBOL does not require a local facilitator. flow-14-live-obol-base-sepolia.sh used the deployed public facilitator at https://x402.gcp.obol.tech and passed end-to-end.
  • The missing local x402-facilitator only affects fork/local regression flows: flow-10, flow-12, and flow-13.

Flow Results

| Flow | Result | Notes |
| --- | --- | --- |
| flow-01-prerequisites | PASS | Docker, Ollama, CLI, k3d, Python deps, remote-signer chart 0.3.2 |
| flow-02-stack-init-up | PASS | Stack init/up, pods healthy, frontend, eRPC, Base Sepolia routing, monitoring |
| flow-03-inference | PASS | Host Ollama, in-cluster Ollama, LiteLLM, tool-call passthrough |
| flow-04-agent | FAIL | Hermes agent init mostly passed, but agent inference, hello response, gateway health, and dashboard deeplink failed |
| flow-05-network | PASS | network list/status/add/remove and invalid endpoint rejection |
| flow-06-sell-setup | PASS | x402 components, pricing, ServiceOffer, HTTPRoute |
| flow-07-sell-verify | PASS | tunnel, local/tunnel 402, verifier metrics/logs, ServiceOffer Ready |
| flow-08-buy | FAIL | paid inference succeeded, but settlement assertions failed: no expected USDC Transfer found and seller balance unchanged |
| flow-09-lifecycle | PASS | sell list/status/stop/delete and resource cleanup |
| flow-10-anvil-facilitator | FAIL | fork-local facilitator binary missing; not required for live Base Sepolia |
| flow-11-dual-stack | PASS | USDC dual-stack seller/buyer passed with ERC-8004 registration, PurchaseRequest, paid inference, settlement receipt, balance deltas |
| flow-12-obol-payment | FAIL | expects deployment/openclaw -n openclaw-obol-agent; current default stack path deployed Hermes, so OpenClaw rollout was not present |
| flow-13-dual-stack-obol | PASS/SKIPPED | script exits pass after skipping because no local X402_FACILITATOR_BIN / X402_RS_DIR; should probably report SKIPPED explicitly |
| flow-14-live-obol-base-sepolia | PASS | live OBOL Base Sepolia seller/buyer passed end-to-end using deployed facilitator |

Baseline/stateful flow summary:

flow-01 PASS
flow-02 PASS
flow-03 PASS
flow-04 FAIL
flow-05 PASS
flow-06 PASS
flow-07 PASS
flow-10 FAIL
flow-12 FAIL
flow-08 FAIL
flow-09 PASS
FAILED_COUNT=4

Independent dual-stack summary:

flow-11 PASS
flow-14 PASS
flow-13 PASS/SKIPPED
FAILED_COUNT=0
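
One way to keep a skipped flow from being tallied as a pass is to make SKIPPED a first-class status, as in the minimal Python sketch below. The `Status` enum, `classify`, and the `skipped` flag are assumptions for illustration; the flow scripts themselves are bash and may signal a skip differently.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    SKIPPED = "SKIPPED"


def classify(exit_code: int, skipped: bool) -> Status:
    """A skip wins over the exit code, so exit 0 after a skip is not a PASS."""
    if skipped:
        return Status.SKIPPED
    return Status.PASS if exit_code == 0 else Status.FAIL


def summarize(results: dict[str, Status]) -> Counter:
    """Count flows per status name."""
    return Counter(status.name for status in results.values())


# Independent dual-stack run from this report: flow-13 exited 0 but skipped.
independent = {
    "flow-11": classify(0, skipped=False),
    "flow-14": classify(0, skipped=False),
    "flow-13": classify(0, skipped=True),
}
summary = summarize(independent)
```

Under this scheme the independent summary would read as two passes and one skip, with a failure count of zero, instead of folding the skip into the pass column.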

Live Chain Evidence

Network: Base Sepolia (84532)

Deployed facilitator: https://x402.gcp.obol.tech

OBOL token:

| Field | Value |
| --- | --- |
| Address | 0x54AE82bc871a4E3E8E2FE1173Cb864B8563D44D4 |
| Name | Obol Network |
| Symbol | OBOL |
| Decimals | 18 |
| EIP-712 domain separator | 0xc21da3ed0501015df2d9efb304b2abbdabeb86398c8fc729d491740a061e9b25 |

Wallet roles:

| Role | Address | Source |
| --- | --- | --- |
| Alice / seller / register | 0xC0De030F6C37f490594F93fB99e2756703c4297E | .env signer |
| Bob / buyer / payer | 0x57b0eF875DeB5A37301F1640E469a2129Da9490E | deterministic derived buyer wallet |

USDC dual-stack receipts (flow-11):

| Purpose | Tx hash | Status |
| --- | --- | --- |
| ERC-8004 registration | 0x4c54519599498eaef85bcb7b24cd132ccf505eb65bc245992b811e0e3b1ff329 | archived |
| Metadata | 0x968c58d524cd3072f178f710795ee49e486467a2ff576ed617c90c4e98cf04c8 | archived |
| Settlement | 0xdb4a1fbb3e0efc7bac6eaf81f3b2560a02dcc4bf2a4311523b78fd155c745db6 | amount verified |

USDC balance deltas (flow-11):

| Token | Address | Before | After | Delta |
| --- | --- | --- | --- | --- |
| USDC | Alice | 8,923,000 micro-USDC | 8,924,000 micro-USDC | +1,000 |
| USDC | Bob signer | 4,995,000 micro-USDC | 4,994,000 micro-USDC | -1,000 |

OBOL live receipts (flow-14):

| Purpose | Tx hash | Status |
| --- | --- | --- |
| ERC-8004 registration | 0x0736fd3c09d463be828e99d47552a3b79b08cc9e36cb98e4e0e2d665871acb52 | archived |
| Metadata | 0xa1c8e0f21bb828c958d67538e60ad222a242bdfbf56f90289fc12d24fff7a892 | archived |
| Settlement | 0x15775ee9a327a985a37476337688313d2993dc301b3501a9cf6e573de0fadb92 | amount verified |

OBOL balance deltas (flow-14):

| Token | Address | Before | After | Expected delta | Actual delta |
| --- | --- | --- | --- | --- | --- |
| OBOL | Alice | 0 wei | 1000000000000000 wei | +1000000000000000 | +1000000000000000 |
| OBOL | Bob signer | 999999006000000000000000 wei | 999999005000000000000000 wei | -1000000000000000 | -1000000000000000 |
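
The delta checks in the USDC and OBOL tables can be reproduced with integer arithmetic on atomic units, as in the sketch below. It is illustrative only and not the assertion code used by the flows; the helper names `to_display` and `settlement_ok` are hypothetical. The 6-decimal USDC scale follows from the micro-USDC unit used above, and the 18-decimal OBOL scale comes from the token table.

```python
from decimal import Decimal

USDC_DECIMALS = 6   # micro-USDC is the atomic unit
OBOL_DECIMALS = 18  # from the OBOL token table


def to_display(atomic: int, decimals: int) -> Decimal:
    """Convert an integer atomic amount into a token amount for display."""
    return Decimal(atomic).scaleb(-decimals)


def settlement_ok(seller: tuple[int, int], buyer: tuple[int, int], price: int) -> bool:
    """Check (before, after) balances: seller gains price, buyer loses price."""
    seller_delta = seller[1] - seller[0]
    buyer_delta = buyer[1] - buyer[0]
    return seller_delta == price and buyer_delta == -price


# Values copied from the flow-11 (USDC) and flow-14 (OBOL) tables.
usdc_ok = settlement_ok((8_923_000, 8_924_000), (4_995_000, 4_994_000), 1_000)
obol_ok = settlement_ok(
    (0, 1_000_000_000_000_000),
    (999_999_006_000_000_000_000_000, 999_999_005_000_000_000_000_000),
    1_000_000_000_000_000,
)
usdc_price = to_display(1_000, USDC_DECIMALS)
obol_price = to_display(1_000_000_000_000_000, OBOL_DECIMALS)
```

Comparing integers in atomic units avoids floating-point rounding; the conversion to a display amount is only for reporting. Both settlements here correspond to the same display price of 0.001 tokens.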

flow-14 also verified:

  • Bob signer wallet equals the deterministic derived buyer address.
  • Bob signer OBOL balance is visible through Bob eRPC before purchase.
  • Alice OBOL ServiceOffer reached Ready=True.
  • Bob PurchaseRequest reached Ready=True.
  • Buyer sidecar had alice-obol auths with remaining=5 before paid inference.
  • Paid inference returned HTTP 200 via paid/qwen3.5:9b.

Assessment

Useful and release-relevant:

  • flow-11-dual-stack passed and proves the USDC seller/buyer path.
  • flow-14-live-obol-base-sepolia passed and proves live OBOL seller/buyer payment through the deployed facilitator.
  • The schema/atomic-units changes remain useful; the live OBOL path used atomic OBOL wei values and verified exact settlement deltas.

Still not full-fleet green:

  • flow-04-agent has Hermes HTTP/gateway failures.
  • flow-08-buy still has a single-stack settlement assertion failure despite paid inference success.
  • flow-12-obol-payment appears stale or mismatched with the current default Hermes runtime because it expects OpenClaw to be deployed.
  • flow-13 should not be reported as PASS when it skips due to a missing fork-local facilitator; it should report an explicit SKIPPED.
  • Fork/local facilitator flows should stay optional and separately named; live Base Sepolia should remain the default OBOL QA path.

@bussyjd
Collaborator

bussyjd commented May 1, 2026

QA update: resilient remote flow run

Commit under remote test: e8972ee9c1b940ac07feb4a89027a8ccf6a42b1a
Current PR head after cleanup hardening: aa18a648fbdf69144f89fdf6fbe6b114afa431ad

Validation

CI checks:

| Check | Status |
| --- | --- |
| PR CI on e8972ee | passed |
| PR CI on aa18a64 | queued/in progress at time of report |

Local checks on aa18a64:

bash -n flows/lib.sh flows/flow-10-anvil-facilitator.sh flows/flow-11-dual-stack.sh flows/flow-12-obol-payment.sh flows/flow-13-dual-stack-obol.sh flows/flow-14-live-obol-base-sepolia.sh: PASS
git diff --check: PASS

Flow tests:

| Flow | Network | QA machine label | Worktree | Result | Artifacts |
| --- | --- | --- | --- | --- | --- |
| flow-13-dual-stack-obol.sh | Anvil fork of Base Sepolia | QA-A | fresh detached worktree at e8972ee | FAIL as intended at Step 43: agent refused to run buy.py; __FLOW13_DONE_RC__=1 | log retained in QA worktree |
| flow-14-live-obol-base-sepolia.sh | live Base Sepolia | QA-B | fresh detached worktree at e8972ee | PASS; steps_failed=0, __FLOW14_DONE_RC__=0 | receipt summary retained in QA worktree |

Flow 13 Findings

The local facilitator container path did not regress:

Facilitator image: ghcr.io/x402-rs/x402-facilitator:1.4.7
Container startup: PASS
/supported base-sepolia v1+v2 exact: PASS
Alice cluster reaches Anvil via host.k3d.internal: PASS
Bob cluster reaches Anvil via host.k3d.internal: PASS
OBOL ServiceOffer Ready=True: PASS
402 gate: PASS

The run then hit the expected non-green condition we wanted to stop masking:

STEP [43] Bob's agent: buy 5 OBOL Permit2 auths from Alice
Agent response: refused to execute arbitrary scripts / scripts outside configured skills
Result: FAIL: Agent refused to run buy.py
Exit: __FLOW13_DONE_RC__=1

That confirms the previous false positive is gone. The fork flow no longer records “buy command issued” when the agent explicitly refuses to buy.
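
A step check that avoids this kind of false positive can require observed evidence of the purchase instead of treating "prompt sent" as success. The Python sketch below illustrates the idea only; the refusal markers, function names, and the `auths_remaining` input are assumptions, not the logic in the flow script.

```python
from typing import Optional

# Hypothetical phrases; a real check would be tuned to the agent's actual replies.
REFUSAL_MARKERS = ("refuse", "cannot run", "can't run", "not able to execute")


def looks_like_refusal(reply: str) -> bool:
    """Heuristic: does the agent reply read as an explicit refusal?"""
    lowered = reply.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def buy_step_passed(reply: str, auths_remaining: Optional[int], expected_auths: int) -> bool:
    """Pass only when the agent did not refuse AND the auths are observable."""
    if looks_like_refusal(reply):
        return False
    return auths_remaining is not None and auths_remaining >= expected_auths


refused = buy_step_passed("I refuse to execute arbitrary scripts.", None, 5)
bought = buy_step_passed("Purchased 5 auths from Alice.", 5, 5)
unverified = buy_step_passed("Done.", None, 5)
```

The third case matters most: a reply that sounds successful but leaves no observable auths still fails, so the result does not depend on parsing the agent's wording alone.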

Cleanup note: this failure exposed that early exits could leave scoped stacks behind, so aa18a64 adds trap cleanup for flow 11/13/14. The QA-A scoped clusters, local facilitator container, and Anvil process were manually cleaned after the run.
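
The flows are bash scripts and the fix described here uses a shell trap. The same guarantee, that cleanup runs even when a step exits early, is sketched below in Python with `contextlib.ExitStack`. The resource names and the `run_flow` helper are hypothetical stand-ins for the scoped stacks, the facilitator container, and the Anvil process.

```python
from contextlib import ExitStack
from typing import Callable


def run_flow(steps: list[Callable[[], bool]], resources: list[str], log: list[str]) -> int:
    """Run steps in order; registered cleanups fire even on an early return."""
    with ExitStack() as stack:
        for name in resources:
            # Callbacks run in reverse registration order when the block exits.
            stack.callback(log.append, f"cleanup:{name}")
        for step in steps:
            if not step():
                return 1  # early exit; ExitStack still unwinds
    return 0


log: list[str] = []
rc = run_flow(
    steps=[lambda: True, lambda: False, lambda: True],
    resources=["stacks", "facilitator", "anvil"],
    log=log,
)
```

Here the second step fails, the third never runs, and all three cleanups are still recorded, with the most recently registered resource released first.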

Live Chain Evidence For Flow 14

Network: Base Sepolia
Facilitator: https://x402.gcp.obol.tech
OBOL token: 0x54AE82bc871a4E3E8E2FE1173Cb864B8563D44D4 (Obol Network / OBOL, 18 decimals)

Wallet roles:

| Role | Address | Source |
| --- | --- | --- |
| Alice / seller / register | 0xC0De030F6C37f490594F93fB99e2756703c4297E | .env signer |
| Bob / buyer / payer | 0x57b0eF875DeB5A37301F1640E469a2129Da9490E | deterministic derived buyer |

Balances:

| Token | Address | Before (wei) | After (wei) | Expected delta (wei) | Actual delta (wei) |
| --- | --- | --- | --- | --- | --- |
| OBOL | Alice | 2000000000000000 | 3000000000000000 | +1000000000000000 | +1000000000000000 |
| OBOL | Bob | 999999004000000000000000 | 999999003000000000000000 | -1000000000000000 | -1000000000000000 |

Transaction receipts:

| Purpose | Tx hash | Status |
| --- | --- | --- |
| ERC-8004 registration | 0x439e6e619ba5a0206dd2afea84f3bc43f9a6e72e74f80888f34017ca82e83af0 | archived |
| Metadata / service offer | 0x422a3dcf0ca2fa25faf865e04376d3b533b4e539efb99e15484817aa73b7ef04 | archived |
| Settlement transfer | 0xe3f8573bdd1e3065a18f0a672b6522045a1e6cb1f279c8f7428b0717203eeba3 | archived |

Runtime evidence:

| Item | Value |
| --- | --- |
| PurchaseRequest | Ready=True |
| LiteLLM route | paid/qwen3.5:9b |
| Buyer sidecar auths | remaining=5 spent=0 before paid inference |
| Paid endpoint status | HTTP 200 |
| Cleanup | scoped QA-B k3d clusters deleted after run |

Follow-up

flow-13 is now honest, but still not demo/release-green because the buyer agent can refuse to run buy.py. For a resilient fork regression, the next fix should make that step deterministic, for example by invoking the packaged buy-x402/scripts/buy.py inside the buyer agent pod or by adding an explicit non-LLM CLI path, while keeping the live flow-14 agent path as the demo-grade coverage.

@OisinKyne OisinKyne merged commit 4d81429 into main May 1, 2026
6 checks passed
@OisinKyne OisinKyne deleted the oisin/buy-sell branch May 1, 2026 11:32