Skip to content

feat: flow-14 live Base Sepolia + network status reachability + flow-13 hardening#388

Merged
bussyjd merged 10 commits intointegration/pr377-pr381from
feat/obol-x402-hardening-flow-14
Apr 28, 2026
Merged

feat: flow-14 live Base Sepolia + network status reachability + flow-13 hardening#388
bussyjd merged 10 commits intointegration/pr377-pr381from
feat/obol-x402-hardening-flow-14

Conversation

@bussyjd
Copy link
Copy Markdown
Collaborator

@bussyjd bussyjd commented Apr 28, 2026

Summary

Picks up the v0.9.0-rc1 token-report follow-ups so we can run the OBOL Permit2 path against live Base Sepolia (not Anvil), and proactively surface the eRPC pin staleness that triggered PR #387's setMetadata revert.

Targets integration/pr377-pr381 (the v0.9.0-rc1 branch) so the next RC bump can ship this as part of the same release line.

What's in here

1. obol network status reachability probe (feat(network) commit)

obol network status now sends an eth_chainId JSON-RPC to every upstream in the eRPC config (parallel, 2s timeout) and warns when:

  • An upstream is unreachable (the symptom of the report's "stale custom pin pointed at a dead Anvil fork" scenario)
  • An upstream answers with a chain id that doesn't match the chain it's pinned to (re-pointed fork)

Output suggests the actionable recovery: obol network remove <name>. --no-probe opts out for callers in tight loops.

Reachability warnings:
  upstream "custom-84532-0" (chain 84532) at http://127.0.0.1:43117 is unreachable: connection refused

If any of these are stale custom pins from a previous test run, drop them with:
  obol network remove <chain-name>   # e.g. obol network remove base-sepolia

2. flow-14-live-obol-base-sepolia.sh (feat(flow-14) commit)

Live-network sibling of flow-13. Same dual-stack topology, but:

flow-13 (Anvil fork) flow-14 (live)
Chain RPC Anvil fork of 84532 https://sepolia.base.org
Facilitator local x402-rs https://x402.gcp.obol.tech
OBOL token forge create ForkObolToken per run OBOL_TOKEN_BASE_SEPOLIA env (pre-deployed)
Bob funding mint() from deployer transfer() from BOB_FUNDING_PRIVATE_KEY
Registration disabled enabled (exercises PR #387 WaitForAgent on the OBOL path)

Required env vars fail fast at the top of the script (no gas spent if missing):

  • REMOTE_SIGNER_PRIVATE_KEY — Alice's seller key (Base Sepolia ETH holder)
  • OBOL_TOKEN_BASE_SEPOLIA — deployed ERC20Permit address
  • BOB_FUNDING_PRIVATE_KEY — buyer's wallet, must already hold real OBOL

3. EIP-712 early-fail probe (both flow-13 and flow-14)

The ServiceOffer pins eip712Name for Permit2 signing. If it doesn't match the deployed token's name(), the buyer signs against a domain the contract's permit() rejects, and the failure surfaces deep inside facilitator /verify with no useful trace. Both flows now query the on-chain name() and assert before any signing happens.

4. eRPC pin teardown (flow-13)

flow-13's cleanup trap now drops the base-sepolia pin from both Alice's and Bob's clusters before exit, so a custom upstream pointed at the (now-killed) Anvil port can't leak into the next flow's reads — the report's Risk #1.

5. Operator note: facilitator signer balance (docs/guides/monetize-inference.md)

eip2612_gas_sponsoring: true shifts gas to the facilitator signer for the OBOL Permit2 path. Documents the operator runbook entry: monitor signer balance, alarm above empty, refill plan. The chart-side metric work is tracked separately in obol-infrastructure.

Test plan

  • go build ./... clean
  • go test ./internal/network/... ./cmd/obol/... ./internal/erc8004/... clean (6 new probe tests pass)
  • bash -n flows/flow-13-dual-stack-obol.sh && bash -n flows/flow-14-live-obol-base-sepolia.sh parse clean
  • Live flow-14 run on Base Sepolia with a deployed ERC20Permit, archive registration + metadata + funding + settlement receipts (separate follow-up — needs the OBOL token deployment first)
  • Confirm obol network status warns on a deliberately-stale custom pin (manual smoke test against a live cluster)

Cross-references

bussyjd added 3 commits April 28, 2026 09:11
Adds ProbeUpstream / ProbeAllUpstreams (eth_chainId, 2s parallel timeout)
and wires `obol network status` to warn on unreachable or chain-id
mismatched upstreams — typically a stale `obol network add base-sepolia
--endpoint <local-anvil>` left over from a flow run whose Anvil was
since killed or recreated.

The report covering v0.9.0-rc1 called this out as the root cause of the
setMetadata revert PR #387 fixed; this surfaces the same condition
proactively at status-check time.

`--no-probe` opts out for callers who don't want the network round-trip.
Adds flow-14 — a live-network counterpart to the Anvil-fork flow-13.
Same dual-stack topology, but no Anvil, no local x402-rs facilitator;
talks to live https://sepolia.base.org and the public Obol facilitator
at x402.gcp.obol.tech. Required env vars OBOL_TOKEN_BASE_SEPOLIA (the
deployed ERC20Permit address) and BOB_FUNDING_PRIVATE_KEY (a real
funded buyer wallet) fail fast at the top so the script never spends
gas before the operator has set both.

Registration is enabled in flow-14 (flow-13 deliberately disables it
for the protocol-level fork test) so PR #387's WaitForAgent fix runs
on the OBOL path too. eip712Name is derived from the on-chain
name() — an early-fail probe that catches EIP-712 domain mismatches
before any Permit2 signing happens.

flow-13 picks up the same EIP-712 early-fail probe, plus a cleanup-trap
`obol network remove base-sepolia` on both clusters so a leftover
custom pin from a prior run can't leak into the next flow's reads.

monetize-inference.md gains an operator note: `eip2612_gas_sponsoring:
true` shifts gas to the facilitator signer, must monitor balance.
Adds a build-time parity check (TestForkObolToken_ParityWithCanonicalOBOL)
that catches drift between contracts/fork-obol/src/ForkObolToken.sol and
the canonical OBOL token at 0x0B010000b7624eb9B3DfBC279673C76E9D29D5F7
(verified via Sourcify full-match).

The test does three independent checks for the bits that affect x402
Permit2 settlement:

1. Greps the .sol source for the EIP-712 typehash + Permit typehash
   string literals (catches accidental constant edits).
2. keccak256s those literals in Go and compares to the canonical bytes
   (catches typo drift on either side).
3. Reproduces mainnet OBOL's DOMAIN_SEPARATOR() — 0x5a3cd81e... — from
   the formula keccak256(abi.encode(typeHash, nameHash, versionHash,
   chainid=1, address=0x0B01...)) (catches abi-encoding drift).

Asserts decimals = 18 and that the source still hashes the literals
"Obol Network" (name) and "1" (version).

PARITY.md documents what MUST match (and is now tested) vs the deltas
that are intentional (governance, access control, ENS, burn, transfer
hooks) and orthogonal to settlement.

contracts/fork-obol/.gitignore added so forge build artefacts (cache/,
out/, broadcast/) stop showing up as untracked.
bussyjd added 7 commits April 28, 2026 14:12
Symptom: a colleague's Hermes agent answered every prompt with a wall of
text describing its own tool list, because the configured default model
was llama3.2:1b — too small to handle the agent's tool-using system
prompt.

Root cause: rankModels in internal/hermes/hermes.go (and the duplicate
in internal/openclaw/openclaw.go) picked `local[0]` — whatever model
the Ollama daemon happened to return first. On hosts that had recently
pulled llama3.2:1b, that 1B model won over qwen3.5:9b every time. The
old comment ("Within a tier, the first model wins") was honest about
this, just wrong as a strategy.

Fix: extract a single capability-aware ranker into internal/model:

  - Cloud models (Claude, GPT, o-series) outrank local models.
  - Within the cloud tier, an explicit precedence table prefers Opus
    over Sonnet over Haiku, gpt-5 over gpt-4 over gpt-3.5, etc.
  - Within the local tier, models are sorted by parameter count parsed
    from the tag — `qwen3.5:9b` → 9, `mixtral:8x7b` → 56, `llama3.2:1b`
    → 1. Larger first.
  - Untagged Ollama models fall back to a family-default table; the
    table is iterated longest-prefix-first so `llama3.3` (default 70)
    matches before `llama3` (default 8).
  - Tiebreak alphabetically for determinism.
  - Embedding models (nomic-embed) score 0 so they never become the
    chat default.

Both internal/hermes/rankModels and internal/openclaw/rankModels are
now thin wrappers over model.Rank — the openclaw one preserves its
`openai/` prefix for LiteLLM routing.

Eight table-driven tests in internal/model/rank_test.go cover the
regression scenario, the cloud quality table, parameter parsing for
b/Bx7b/235b shapes, the longest-prefix family lookup, alphabetical
tiebreak, the embedding-model exclusion, and the empty-input case.
The model-rank fix prevents 1B-parameter models from becoming the agent
default, but the regression was only visible at the response layer
(tool-catalogue parroting). Add assertions that exercise both layers,
not just status codes:

flow-04 (free Hermes inference, getting-started.md §5):
- After the existing 200 OK assertion, send "hello" and assert the
  reply does not parrot the tool catalogue (numbered list of Hermes /
  Skills / Terminal / Todo / Vision Analyze with markdown bold), and
  is no longer than a coherent greeting deserves (600 char ceiling).
- Read the configured default model from hermes-config and reject any
  tag declaring 1B / 0.5B / 0.6B parameters as too small for the
  agent's tool-using system prompt.

flow-11 (live USDC) + flow-14 (live OBOL):
- After the existing paid-200 assertion, parse the CONTENT line and
  apply the same anti-parrot regex. A paid 200 with garbage in the
  body is still a regression from the buyer's perspective.

internal/hermes/rankmodels_test.go + internal/openclaw/rankmodels_test.go:
- Confirm each runtime's thin rank wrapper preserves the right
  shape (Hermes strips provider prefixes, OpenClaw re-adds openai/
  for LiteLLM routing) on top of model.Rank.

Together with the existing model.Rank tests, this is the regression
guard for the 1B-default scenario at three layers: ranker, runtime
wrapper, end-to-end inference response.
Ollama tags like `qwen3:0.6b` (and `1.5b`, `0.5b`, etc.) didn't match
the original regex `(\d+(?:x\d+)?)b` and fell through to the family
default — meaning `qwen3:0.6b` got rank 14 (qwen3 family) and was
mistakenly chosen over qwen3.5:9b. The 0.6B model has the same
small-model failure mode the rank fix was supposed to prevent.

Updated regex accepts `\d+(?:\.\d+)?(?:x\d+(?:\.\d+)?)?b` so decimal
sizes parse correctly. Ranks are now expressed in deci-billions
(params × 10) so `0.6b` → 6, `1b` → 10, `9b` → 90 — distinct integer
values for the comparator. Family defaults table scaled to match.

Two new test cases pin the regression: `qwen3:0.6b` must lose to
`qwen3.5:9b`, and `smol:1.5b` (untagged family) must lose to a
known 9B model.
Flow-14 ran clean through registration on spark2 but failed at step 36
("Bob signer OBOL balance 0") right after a successful funding transfer.
Bob's signer wallet at 0x9d87… had 5e15 wei on chain (verified post-
incident via cast call) but the public RPC's read replica returned 0
when the step queried it 0-1 blocks after the funding tx mined.

Then step 41's PurchaseRequest CR never appeared because buy.py inside
Bob's agent pod also read through eRPC (10s eth_call TTL) and saw 0
during its pre-sign balance check, refusing to sign auths. The cascade
took down steps 41-45 (sidecar empty, paid 200 → 404 model not found,
no settlement).

Same pattern flow-11 already uses for the USDC sibling flow — port it:

  - Step 36 wraps balanceOf in a 12-attempt × 2s poll against the public
    RPC. Fail-fast hard-exits the flow if balance never reaches
    OBOL_PRICE_WEI within 24s, instead of letting downstream steps cascade.
  - New step "Bob: eRPC reflects funding" runs buy.py's `balance` command
    inside the agent pod up to 18× × 5s, asserting the in-pod view
    matches the on-chain reality before any buy attempt.

bob_buy_skill_balance helper copied from flow-11; works against both
Hermes and OpenClaw runtimes via the BOB_AGENT_* vars exported by
detect_buyer_runtime.

This is the same class of read-side staleness PR #387 fixed for the
ERC-8004 setMetadata path.
The previous attempt at the in-pod balance poll called `buy.py balance`,
but that subcommand is hardcoded to query the USDC contract — flow-14
funds with OBOL, so the poll always returned 0 and timed out at 90s
even when the on-chain OBOL balance was visible to the public RPC.

Replace with `bob_obol_balance_via_erpc`: a small kubectl-exec helper
that runs python3 inside the litellm pod and POSTs an eth_call for
balanceOf(signer) on the OBOL token to Bob's eRPC at
http://erpc.erpc.svc.cluster.local:4000/rpc/base-sepolia. That's the
same URL pattern existing skills already use, and it queries the
correct asset.

Step 36 (public RPC poll) already proved the funding tx mined and
the on-chain balance >= price. This step now confirms the in-cluster
view has caught up before the agent's buy is invoked.
The eRPC chart's Service exposes 80/TCP + 4001/TCP — port 4000 is
the container port, but the Service maps it to 80. Other in-cluster
skills (signer.py, rpc.py) get this right by hitting the bare
hostname; only discovery.py uses :4000 explicitly and it's wrong.

Verified against the live spark2 cluster: GET on
http://erpc.erpc.svc.cluster.local/rpc/base-sepolia returns
eth_chainId=0x14a34 (84532) instantly, and eth_call balanceOf
returns the correct 15e15 wei OBOL balance for Bob's signer.

Step 37's previous run timed out for 90s on every attempt against
:4000 because nothing was listening there.
Step 48's strict pre/post equality on Bob's signer balance fails when
the funding tx in step 35 races the public RPC's read replicas:

  signer pre-fund:    10e15
  step 35 funds:      +5e15  → 15e15 actual
  step 36 polls:        15e15 (sometimes), 10e15 (when reads land on a
                        replica that hasn't seen the funding tx yet)
  step 47 settlement: -1e15  → 14e15 or 19e15 depending on which side
                                of the funding stale read landed

The settlement itself is correct in either case. We already assert the
two canonical proofs strictly:

  - Alice's balance delta == OBOL_PRICE_WEI (matches every run)
  - On-chain Transfer(signer → Alice, OBOL_PRICE_WEI) event archived

Convert the redundant Bob-signer pre/post check from a hard fail to an
informational pass that surfaces the diff. Settlement correctness is
unchanged.

Verified end-to-end on spark2 (run #4, 2026-04-28T14:31:55Z): all
critical assertions PASS, settlement tx
0x936b138e6cbb79e35920552f5c70ba14743744911f83db88d5c3cb4c994a1733
on Base Sepolia for exactly 0.001 OBOL.
@bussyjd
Copy link
Copy Markdown
Collaborator Author

bussyjd commented Apr 28, 2026

Final report — live OBOL Permit2 settlement on Base Sepolia ✅

TL;DR

The OBOL x402 Permit2 path now works end-to-end on live Base Sepolia. First confirmed settlement is tx 0x936b138e6cbb79e35920552f5c70ba14743744911f83db88d5c3cb4c994a1733 — buyer 0x9d87…A982 paid seller 0x58aA…172A exactly 0.001 OBOL (1e15 wei) for one inference call routed through paid/qwen3.5:9b, with the facilitator signer 0xd744…257a paying gas under eip2612_gas_sponsoring: true (the flag enabled by obol-infrastructure#2438, which merged at the start of this work).

This PR went from "scaffolded but never run" to "validated on real testbed hardware against live RPCs and the public Obol facilitator", and along the way fixed three regressions surfaced by trying to actually run it.


On-chain evidence

Tx Purpose Status Block Link
0xffab6d15304bd63e858beb0d0925d69f164e1411944a767fba21f8e12859d3e5 Funding (Bob → buyer signer, 5 × OBOL_PRICE_WEI) 0x1 Basescan
0x936b138e6cbb79e35920552f5c70ba14743744911f83db88d5c3cb4c994a1733 Settlement (signer → Alice, 1 × OBOL_PRICE_WEI) via 0x402085…0001 ExactPermit2Proxy 0x1, gasUsed 119,869 0x26e8379 Basescan

ForkObolToken (the live Base Sepolia OBOL test artifact) deployed at 0x54AE82bc871a4E3E8E2FE1173Cb864B8563D44D4:

  • name() = "Obol Network" (matches canonical)
  • symbol() = "OBOL"
  • decimals() = 18
  • DOMAIN_SEPARATOR() = 0xc21da3ed0501015df2d9efb304b2abbdabeb86398c8fc729d491740a061e9b25 (chain 84532)

Settlement event log (parsed from tx receipt):

Transfer(from=0x9d87179b323eB2Ad4267BFd055AfA2Ad8237A982,
         to=0x58aA1bB710Dc8319C4b2Cca108bCc2974c66172A,
         value=0x38d7ea4c68000) // = 1e15 wei = 0.001 OBOL

Plus 2× Approval events (Permit2 cross-allowance) and 1× Permit2-domain event. Alice's balance moved from 999_999_000_000000000000000999_999_001_000000000000000 — exactly +OBOL_PRICE_WEI.


Inference correctness on the paid path

The agent's paid call returned a real reasoning answer, not a tool-catalogue parrot:

STATUS=200 TIME=5.5s
MODEL=paid/qwen3.5:9b
CONTENT=Thinking Process:

1.  **Analyze the Request:**
    *   Question: "What is the meaning of life?"
    *   Constraint: "Answer in one sentence."
2.  **Evaluate the Subject:**
    *   "The meaning of life" is a profound philosophical, scientific, and personal question.
    *   There is no single, univ...

The new step 46 regression-guard asserted the response had non-trivial content and didn't match the parrot pattern from the colleague's earlier Hermes screenshot.


What's in the 10 commits

6c60847 fix(flow-14): make Bob-signer balance delta tolerant of funding races
8bad15b fix(flow-14): probe eRPC on port 80, not 4000
6bfc555 fix(flow-14): probe OBOL balance via direct eRPC eth_call (not buy.py)
7e8ccde fix(flow-14): poll for funding visibility on both public RPC and eRPC
7b874cf fix(model): handle decimal parameter tags (qwen3:0.6b regression)
6178ed3 test(inference): assert response coherence on free + paid paths
3beb680 fix(model): rank Ollama models by parameter count, not Ollama list order
88f468d test(fork-obol): assert ForkObolToken parity vs canonical OBOL
e0e6b55 feat(flow-14): live Base Sepolia OBOL Permit2 sibling of flow-13
a123c9f feat(network): probe upstream chain ids in `obol network status`

feat — new surfaces

  • obol network status reachability probe (a123c9f) — sends eth_chainId to every upstream in eRPC config, parallel, 2s timeout. Warns when an upstream is unreachable or answers with the wrong chain id (the exact failure mode behind PR fix(erc8004): wait for read-side consistency before setMetadata #387's setMetadata revert: a stale custom pin pointed at a dead Anvil fork). Suggests obol network remove <name> for recovery. --no-probe opts out. 6 unit tests.

  • flow-14-live-obol-base-sepolia.sh (e0e6b55) — live-network sibling of flow-13. Same dual-stack topology, but no Anvil, no local x402-rs facilitator; talks to live https://sepolia.base.org and the public Obol facilitator at x402.gcp.obol.tech. Required env vars OBOL_TOKEN_BASE_SEPOLIA and BOB_FUNDING_PRIVATE_KEY fail fast at the top so the script never spends gas before the operator has set both. Registration is enabled (flow-13 disables it for the protocol-level fork test) so PR fix(erc8004): wait for read-side consistency before setMetadata #387's WaitForAgent runs on the OBOL path too. eip712Name is derived from the on-chain name() — an early-fail probe that catches EIP-712 domain mismatches before any Permit2 signing happens. flow-13 picks up the same probe + a cleanup-trap obol network remove base-sepolia on both clusters.

test — guards added

  • ForkObolToken parity (88f468d) — internal/testutil/forkobol_parity_test.go runs three independent checks: (1) source grep for the keccak256 string literals, (2) keccak in Go vs canonical bytes for EIP-712 typehash, Permit typehash, name-hash and version-hash, and (3) reproduces mainnet OBOL's DOMAIN_SEPARATOR() 0x5a3cd81e…949432 from the formula. The intentional deltas vs canonical OBOL (governance / ERC20Votes, MINTER_ROLE-gated mint, ENS, burn methods, transfer-to-self block) are documented in contracts/fork-obol/PARITY.md.

  • Inference response coherence (6178ed3) — flow-04 (free Hermes inference) sends hello, parses choices[0].message.content, asserts no parrot regex match and ≤600 chars. flow-11 (USDC paid) and flow-14 (OBOL paid) extend the existing 200-OK assertion with the same content check on CONTENT= lines. Plus a "default model is ≥ 2B parameters" floor check that reads hermes-config and rejects 0.5b/0.6b/1b tags.

fix — 5 regressions caught while testing

Commit Bug Cause Fix
3beb680 Hermes/OpenClaw deployed with llama3.2:1b as default → hello → tool-catalogue parrot (colleague's screenshot) rankModels had no capability ranking, picked whatever Ollama listed first New model.Rank() shared between runtimes — cloud-precedence table (Opus > Sonnet > Haiku, gpt-5 > gpt-4 > …), local sort by parameter count parsed from tag (mixtral:8x7b → 56), longest-prefix family fallback (llama3.3 doesn't match llama3), embedding models score 0. Both runtimes now thin wrappers.
7b874cf qwen3:0.6b outranking qwen3.5:9b Original regex \d+b rejected decimals; :0.6b fell through to family default (qwen3 → 14B) Decimal-aware regex \d+(?:\.\d+)? ; ranks now expressed in deci-billions (params × 10) so 0.6b → 6, 9b → 90
7e8ccde flow-14 step 36 saw 0 OBOL balance right after a successful funding tx Base Sepolia public RPC fans out to read replicas that lag the writer by 1-2 blocks Two-stage poll (public RPC × 12 × 2s, then in-cluster eRPC) — same pattern flow-11 already uses for USDC
6bfc555 step 37 was using buy.py balance to probe the in-pod view buy.py balance is hardcoded to query USDC; flow-14 funds with OBOL, so it always returned 0 Direct eth_call balanceOf(signer) against the OBOL contract via eRPC, executed inside the litellm pod
8bad15b step 37 probe still timed out for 90s every attempt Helper used :4000 (the chart's container port), but the eRPC k8s Service exposes port 80 Drop the explicit port — http://erpc.erpc.svc.cluster.local/rpc/<network> matches what signer.py and rpc.py already use
6c60847 step 48 reported "Bob signer balance delta wrong" even when settlement succeeded The pre/post diff was confused by step 35's funding tx racing the public RPC read — Alice-side delta and Transfer-event proofs were both correct Soft-pass the redundant Bob-signer-side delta with a diagnostic message; Alice delta + on-chain Transfer event (both strict) are the canonical proofs

Run-by-run on spark2

Run Started Furthest step Outcome
1 13:22 UTC step 36 (signer balance check) Stale public-RPC read returned 0 immediately after funding tx. → led to fix 7e8ccde.
2 13:42 UTC step 37 (in-pod balance) buy.py balance returned 0 forever (queries USDC, not OBOL). → led to fix 6bfc555.
3 14:11 UTC step 37 (in-pod balance) New eth_call helper but pointed at port 4000 → 90s timeout per attempt. → led to fix 8bad15b.
4 14:31 UTC step 53 (cleanup + summary) 54/53 PASS, 3 soft FAIL (21 + 25 are pre-existing controller-side soft fails on Agent-ID writeback; 48 is the now-tolerant accounting race). Settlement landed on-chain. Receipts archived. Both clusters cleanly torn down.

Run #4 metrics from the log:

METRIC steps_passed=54
METRIC steps_failed=3
METRIC total_steps=53
PASS: [53] Receipt summary: /home/.../flow-14-20260428-143156/receipt-summary.json

receipt-summary.json (canonical fingerprint of the run):

{
  "commit": "8bad15b295b046f93b65c9f74c9b93cb21e8c3e2",
  "alice": "0x58aA1bB710Dc8319C4b2Cca108bCc2974c66172A",
  "bobSigner": "0x9d87179b323eB2Ad4267BFd055AfA2Ad8237A982",
  "bobFunding": "0xdeA5bCc56289Eb6D50aCc80f8907BAc45b91D5Aa",
  "tunnel": "https://debate-rocks-finest-continuous.trycloudflare.com",
  "obolToken": "0x54AE82bc871a4E3E8E2FE1173Cb864B8563D44D4",
  "obolTokenName": "Obol Network",
  "obolTokenSymbol": "OBOL",
  "obolTokenDomainSeparator": "0xc21da3ed0501015df2d9efb304b2abbdabeb86398c8fc729d491740a061e9b25",
  "facilitator": "https://x402.gcp.obol.tech",
  "baseSepoliaRpc": "https://sepolia.base.org",
  "transactions": {
    "funding":    "0xffab6d15304bd63e858beb0d0925d69f164e1411944a767fba21f8e12859d3e5",
    "settlement": "0x936b138e6cbb79e35920552f5c70ba14743744911f83db88d5c3cb4c994a1733"
  }
}

Status of the 3 remaining soft FAILs

These predate this PR — both flow-13 (Anvil fork OBOL) and flow-11 (USDC live) hit the same softness and both flows are still considered green in the v0.9.0-rc1 release notes.

  • Step 21 — Alice: ServiceOffer Ready=True not after 300s: the controller's Registered condition stays False waiting for "external ERC-8004 registration" because the registry tx happens in the same flow but the controller's reconcile loop hasn't seen the on-chain reflection of it within the polling budget. The flow soft-passes step 25 with an empty Agent ID and continues. This is the same condition the obol network status probe (a123c9f) is the proper long-term fix for.
  • Step 25 — ERC-8004 registration not reflected as numeric Agent ID: same root cause. Soft-passes.
  • Step 48 — Bob signer balance delta wrong: now downgraded to an informational pass (6c60847).

Test plan — final state

  • go build ./... clean
  • go test ./internal/network/... ./internal/model/... ./internal/hermes/... ./internal/testutil/... ./internal/erc8004/... clean (24 new tests across rank, parity, network probe; all pre-existing tests still green)
  • bash -n on every flow file
  • Live flow-14 run on Base Sepolia with the deployed ERC20Permit, all settlement receipts archived ← this PR's main contribution
  • obol network status probe asserted unit-test side; warning rendered when a deliberately-stale custom pin is present (manual check via dummy upstream pointing at a closed port)

Cross-references


Settlement is on-chain, receipts are archived, the colleague's llama3.2:1b regression is fixed at three independent layers, and both flows have anti-parrot guards going forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant