feat: flow-14 live Base Sepolia + network status reachability + flow-13 hardening by bussyjd · Pull Request #388 · ObolNetwork/obol-stack

bussyjd · 2026-04-28T01:11:59Z

Summary

Picks up the v0.9.0-rc1 token-report follow-ups so we can run the OBOL Permit2 path against live Base Sepolia (not Anvil), and proactively surface the eRPC pin staleness that triggered PR #387's setMetadata revert.

Targets integration/pr377-pr381 (the v0.9.0-rc1 branch) so the next RC bump can ship this as part of the same release line.

What's in here

1. `obol network status` reachability probe (`feat(network)` commit)

obol network status now sends an eth_chainId JSON-RPC to every upstream in the eRPC config (parallel, 2s timeout) and warns when:

An upstream is unreachable (the symptom of the report's "stale custom pin pointed at a dead Anvil fork" scenario)
An upstream answers with a chain id that doesn't match the chain it's pinned to (re-pointed fork)

Output suggests the actionable recovery: obol network remove <name>. --no-probe opts out for callers in tight loops.

Reachability warnings:
  upstream "custom-84532-0" (chain 84532) at http://127.0.0.1:43117 is unreachable: connection refused

If any of these are stale custom pins from a previous test run, drop them with:
  obol network remove <chain-name>   # e.g. obol network remove base-sepolia
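The probe described above can be sketched in Go roughly as follows. This is a minimal illustration, not the PR's code: the `upstream` type and the `probe` / `probeAll` / `parseChainID` names are hypothetical and do not reflect the real `ProbeUpstream` / `ProbeAllUpstreams` signatures, and the warning strings only approximate the output shown above. A local `httptest` server stands in for a real upstream so the example runs offline.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
	"sync"
	"time"
)

// upstream is a hypothetical stand-in for one eRPC upstream entry.
type upstream struct {
	ID      string
	ChainID uint64 // chain the upstream is pinned to
	URL     string
}

// parseChainID converts an eth_chainId result such as "0x14a34" to 84532.
func parseChainID(hexStr string) (uint64, error) {
	return strconv.ParseUint(strings.TrimPrefix(hexStr, "0x"), 16, 64)
}

// probe sends eth_chainId to one upstream and returns a warning, or "" if healthy.
func probe(ctx context.Context, u upstream) string {
	body := []byte(`{"jsonrpc":"2.0","id":1,"method":"eth_chainId","params":[]}`)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, u.URL, bytes.NewReader(body))
	if err != nil {
		return fmt.Sprintf("upstream %q has an invalid URL: %v", u.ID, err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return fmt.Sprintf("upstream %q (chain %d) at %s is unreachable: %v", u.ID, u.ChainID, u.URL, err)
	}
	defer resp.Body.Close()
	var reply struct {
		Result string `json:"result"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&reply); err != nil {
		return fmt.Sprintf("upstream %q sent an invalid JSON-RPC reply: %v", u.ID, err)
	}
	got, err := parseChainID(reply.Result)
	if err != nil {
		return fmt.Sprintf("upstream %q returned an unparsable chain id %q", u.ID, reply.Result)
	}
	if got != u.ChainID {
		return fmt.Sprintf("upstream %q is pinned to chain %d but answers as chain %d", u.ID, u.ChainID, got)
	}
	return ""
}

// probeAll probes every upstream concurrently, each with its own 2s budget.
func probeAll(ups []upstream) []string {
	warnings := make([]string, len(ups))
	var wg sync.WaitGroup
	for i, u := range ups {
		wg.Add(1)
		go func(i int, u upstream) {
			defer wg.Done()
			ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
			defer cancel()
			warnings[i] = probe(ctx, u)
		}(i, u)
	}
	wg.Wait()
	return warnings
}

func main() {
	// Local stand-in for a healthy Base Sepolia upstream (chain id 0x14a34 = 84532).
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, `{"jsonrpc":"2.0","id":1,"result":"0x14a34"}`)
	}))
	defer srv.Close()

	ups := []upstream{
		{ID: "healthy", ChainID: 84532, URL: srv.URL},
		{ID: "re-pointed", ChainID: 1, URL: srv.URL},
	}
	for _, w := range probeAll(ups) {
		if w != "" {
			fmt.Println("warning:", w)
		}
	}
}
```

Giving each goroutine its own context keeps one slow upstream from eating into the budget of the others, so the whole check should finish in about 2s regardless of how many upstreams are configured.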

2. `flow-14-live-obol-base-sepolia.sh` (`feat(flow-14)` commit)

Live-network sibling of flow-13. Same dual-stack topology, but:

| | flow-13 (Anvil fork) | flow-14 (live) |
|---|---|---|
| Chain RPC | Anvil fork of 84532 | https://sepolia.base.org |
| Facilitator | local x402-rs | https://x402.gcp.obol.tech |
| OBOL token | `forge create ForkObolToken` per run | `OBOL_TOKEN_BASE_SEPOLIA` env (pre-deployed) |
| Bob funding | `mint()` from deployer | `transfer()` from `BOB_FUNDING_PRIVATE_KEY` |
| Registration | disabled | enabled (exercises PR #387 `WaitForAgent` on the OBOL path) |

Required env vars fail fast at the top of the script (no gas spent if missing):

REMOTE_SIGNER_PRIVATE_KEY — Alice's seller key (Base Sepolia ETH holder)
OBOL_TOKEN_BASE_SEPOLIA — deployed ERC20Permit address
BOB_FUNDING_PRIVATE_KEY — buyer's wallet, must already hold real OBOL

3. EIP-712 early-fail probe (both flow-13 and flow-14)

The ServiceOffer pins eip712Name for Permit2 signing. If it doesn't match the deployed token's name(), the buyer signs against a domain the contract's permit() rejects, and the failure surfaces deep inside facilitator /verify with no useful trace. Both flows now query the on-chain name() and assert before any signing happens.
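The flows themselves are shell scripts, so the following Go sketch only illustrates the logic of the check: decode the ABI-encoded string returned by an `eth_call` to `name()` and compare it with the pinned `eip712Name`. The helper names (`decodeABIString`, `checkEIP712Name`, `encodeABIString`) are hypothetical, and the example fabricates the call result locally instead of talking to an RPC endpoint.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// nameSelector is the 4-byte selector for name(), sent as eth_call data.
const nameSelector = "0x06fdde03"

// encodeABIString mimics the return data of a string-returning call.
// It exists only so the example runs without an RPC endpoint.
func encodeABIString(s string) string {
	padded := make([]byte, (len(s)+31)/32*32)
	copy(padded, s)
	return fmt.Sprintf("0x%064x%064x%s", 32, len(s), hex.EncodeToString(padded))
}

// word reads the 32-byte big-endian word at byte offset at as a uint64.
func word(raw []byte, at uint64) (uint64, error) {
	n := uint64(len(raw))
	if at > n || n-at < 32 {
		return 0, errors.New("truncated ABI data")
	}
	for _, b := range raw[at : at+24] {
		if b != 0 {
			return 0, errors.New("ABI word overflows uint64")
		}
	}
	return binary.BigEndian.Uint64(raw[at+24 : at+32]), nil
}

// decodeABIString decodes the hex result of a call that returns one string.
func decodeABIString(result string) (string, error) {
	raw, err := hex.DecodeString(strings.TrimPrefix(result, "0x"))
	if err != nil {
		return "", err
	}
	offset, err := word(raw, 0) // where the string's length word starts
	if err != nil {
		return "", err
	}
	length, err := word(raw, offset)
	if err != nil {
		return "", err
	}
	start := offset + 32
	if length > uint64(len(raw))-start {
		return "", errors.New("string runs past end of ABI data")
	}
	return string(raw[start : start+length]), nil
}

// checkEIP712Name fails when the pinned name differs from the on-chain name().
func checkEIP712Name(pinned, nameCallResult string) error {
	onChain, err := decodeABIString(nameCallResult)
	if err != nil {
		return fmt.Errorf("cannot decode name(): %w", err)
	}
	if onChain != pinned {
		return fmt.Errorf("eip712Name %q does not match on-chain name() %q", pinned, onChain)
	}
	return nil
}

func main() {
	// Stand-in for what an eth_call with nameSelector would return.
	result := encodeABIString("Obol Network")
	fmt.Println("eth_call data:", nameSelector)
	fmt.Println("match:   ", checkEIP712Name("Obol Network", result))
	fmt.Println("mismatch:", checkEIP712Name("OBOL", result))
}
```

Running the comparison before any signature is produced turns a late facilitator /verify failure into an immediate, readable error.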

4. eRPC pin teardown (flow-13)

flow-13's cleanup trap now drops the base-sepolia pin from both Alice's and Bob's clusters before exit, so a custom upstream pointed at the (now-killed) Anvil port can't leak into the next flow's reads — the report's Risk #1.

5. Operator note: facilitator signer balance (`docs/guides/monetize-inference.md`)

eip2612_gas_sponsoring: true shifts gas to the facilitator signer for the OBOL Permit2 path. Documents the operator runbook entry: monitor the signer balance, alarm at a threshold above empty, and keep a refill plan. The chart-side metric work is tracked separately in obol-infrastructure.

Test plan

go build ./... clean
go test ./internal/network/... ./cmd/obol/... ./internal/erc8004/... clean (6 new probe tests pass)
bash -n flows/flow-13-dual-stack-obol.sh && bash -n flows/flow-14-live-obol-base-sepolia.sh parse clean
Live flow-14 run on Base Sepolia with a deployed ERC20Permit, archive registration + metadata + funding + settlement receipts (separate follow-up — needs the OBOL token deployment first)
Confirm obol network status warns on a deliberately-stale custom pin (manual smoke test against a live cluster)

Cross-references

Token-report follow-ups: /Users/bussyjd/Development/Obol_Workbench/obol-stack/.claude/worktrees/integration-pr377-pr381/.tmp/v0.9.0-rc1-obol-x402-report.md (Risk #1, Risk #2, Lesson #6).
PR #387, "fix(erc8004): wait for read-side consistency before setMetadata" (the fix for the setMetadata revert whose root cause the probe now makes detectable at `obol network status` time).
obol-infrastructure #2438 (the facilitator chart change this PR's flow-14 exercises against the live-cluster facilitator).

Adds ProbeUpstream / ProbeAllUpstreams (eth_chainId, run in parallel with a 2s timeout) and wires `obol network status` to warn on unreachable or chain-id-mismatched upstreams. The typical cause is a stale `obol network add base-sepolia --endpoint <local-anvil>` left over from a flow run whose Anvil was since killed or recreated. The report covering v0.9.0-rc1 called this out as the root cause of the setMetadata revert that PR #387 fixed; the probe surfaces the same condition proactively at status-check time. `--no-probe` opts out for callers who don't want the network round-trip.

Adds flow-14, a live-network counterpart to the Anvil-fork flow-13. Same dual-stack topology, but no Anvil and no local x402-rs facilitator; it talks to live https://sepolia.base.org and the public Obol facilitator at x402.gcp.obol.tech. Required env vars OBOL_TOKEN_BASE_SEPOLIA (the deployed ERC20Permit address) and BOB_FUNDING_PRIVATE_KEY (a real funded buyer wallet) fail fast at the top so the script never spends gas before the operator has set both.

Registration is enabled in flow-14 (flow-13 deliberately disables it for the protocol-level fork test) so PR #387's WaitForAgent fix runs on the OBOL path too. eip712Name is derived from the on-chain name(), which acts as an early-fail probe that catches EIP-712 domain mismatches before any Permit2 signing happens.

flow-13 picks up the same EIP-712 early-fail probe, plus a cleanup-trap `obol network remove base-sepolia` on both clusters so a leftover custom pin from a prior run can't leak into the next flow's reads.

monetize-inference.md gains an operator note: `eip2612_gas_sponsoring: true` shifts gas to the facilitator signer, so its balance must be monitored.

Adds a build-time parity check (TestForkObolToken_ParityWithCanonicalOBOL) that catches drift between contracts/fork-obol/src/ForkObolToken.sol and the canonical OBOL token at 0x0B010000b7624eb9B3DfBC279673C76E9D29D5F7 (verified via Sourcify full-match).

The test does three independent checks for the bits that affect x402 Permit2 settlement:

1. Greps the .sol source for the EIP-712 typehash + Permit typehash string literals (catches accidental constant edits).
2. keccak256s those literals in Go and compares to the canonical bytes (catches typo drift on either side).
3. Reproduces mainnet OBOL's DOMAIN_SEPARATOR() (0x5a3cd81e...) from the formula keccak256(abi.encode(typeHash, nameHash, versionHash, chainid=1, address=0x0B01...)) (catches abi-encoding drift).

It also asserts decimals = 18 and that the source still hashes the literals "Obol Network" (name) and "1" (version).

PARITY.md documents what MUST match (and is now tested) vs the deltas that are intentional (governance, access control, ENS, burn, transfer hooks) and orthogonal to settlement.

contracts/fork-obol/.gitignore added so forge build artefacts (cache/, out/, broadcast/) stop showing up as untracked.

Symptom: a colleague's Hermes agent answered every prompt with a wall of text describing its own tool list, because the configured default model was llama3.2:1b, which is too small to handle the agent's tool-using system prompt.

Root cause: rankModels in internal/hermes/hermes.go (and the duplicate in internal/openclaw/openclaw.go) picked `local[0]`, i.e. whatever model the Ollama daemon happened to return first. On hosts that had recently pulled llama3.2:1b, that 1B model won over qwen3.5:9b every time. The old comment ("Within a tier, the first model wins") was honest about this, just wrong as a strategy.

Fix: extract a single capability-aware ranker into internal/model:

- Cloud models (Claude, GPT, o-series) outrank local models.
- Within the cloud tier, an explicit precedence table prefers Opus over Sonnet over Haiku, gpt-5 over gpt-4 over gpt-3.5, etc.
- Within the local tier, models are sorted by parameter count parsed from the tag: `qwen3.5:9b` → 9, `mixtral:8x7b` → 56, `llama3.2:1b` → 1. Larger first.
- Untagged Ollama models fall back to a family-default table; the table is iterated longest-prefix-first so `llama3.3` (default 70) matches before `llama3` (default 8).
- Tiebreak alphabetically for determinism.
- Embedding models (nomic-embed) score 0 so they never become the chat default.

Both internal/hermes/rankModels and internal/openclaw/rankModels are now thin wrappers over model.Rank; the openclaw one preserves its `openai/` prefix for LiteLLM routing.

Eight table-driven tests in internal/model/rank_test.go cover the regression scenario, the cloud quality table, parameter parsing for b/Bx7b/235b shapes, the longest-prefix family lookup, alphabetical tiebreak, the embedding-model exclusion, and the empty-input case.
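The ordering rules above can be illustrated with a small Go sketch. It is not the code in internal/model: the `candidate` type and the `familyDefault`, `score` and `rank` names are hypothetical, the family table holds only the two entries quoted in the commit message, embedding detection is reduced to a substring check, and the cloud precedence table is left out (cloud models simply sort ahead of local ones).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// familyDefaults maps a family prefix to an assumed size in billions of parameters.
// Only these two entries appear in the text; a real table would hold more.
var familyDefaults = map[string]int{"llama3.3": 70, "llama3": 8}

// familyDefault returns the default for the longest matching prefix, or 0.
func familyDefault(name string) int {
	prefixes := make([]string, 0, len(familyDefaults))
	for p := range familyDefaults {
		prefixes = append(prefixes, p)
	}
	// Longest first, so "llama3.3" is tried before "llama3".
	sort.Slice(prefixes, func(i, j int) bool { return len(prefixes[i]) > len(prefixes[j]) })
	for _, p := range prefixes {
		if strings.HasPrefix(name, p) {
			return familyDefaults[p]
		}
	}
	return 0
}

// candidate is a hypothetical input shape for the ranker.
type candidate struct {
	Name   string
	Cloud  bool
	Params int // billions parsed from the tag; 0 when the model is untagged
}

// score returns 0 for embedding models, else the tagged or family-default size.
func score(c candidate) int {
	if strings.Contains(c.Name, "embed") {
		return 0
	}
	if c.Params > 0 {
		return c.Params
	}
	return familyDefault(c.Name)
}

// rank sorts cloud before local, larger before smaller, then alphabetically.
func rank(cs []candidate) []string {
	sorted := append([]candidate(nil), cs...)
	sort.Slice(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.Cloud != b.Cloud {
			return a.Cloud
		}
		if sa, sb := score(a), score(b); sa != sb {
			return sa > sb
		}
		return a.Name < b.Name
	})
	names := make([]string, len(sorted))
	for i, c := range sorted {
		names[i] = c.Name
	}
	return names
}

func main() {
	// Ollama list order from the regression: the 1B model comes first.
	fmt.Println(rank([]candidate{
		{Name: "llama3.2:1b", Params: 1},
		{Name: "nomic-embed-text"},
		{Name: "qwen3.5:9b", Params: 9},
		{Name: "llama3.3"},
	}))
}
```

Because the comparator never looks at input position, the result no longer depends on the order in which the Ollama daemon lists its models.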

The model-rank fix prevents 1B-parameter models from becoming the agent default, but the regression was only visible at the response layer (tool-catalogue parroting). Add assertions that exercise both layers, not just status codes:

flow-04 (free Hermes inference, getting-started.md §5):

- After the existing 200 OK assertion, send "hello" and assert the reply does not parrot the tool catalogue (numbered list of Hermes / Skills / Terminal / Todo / Vision Analyze with markdown bold), and is no longer than a coherent greeting deserves (600 char ceiling).
- Read the configured default model from hermes-config and reject any tag declaring 1B / 0.5B / 0.6B parameters as too small for the agent's tool-using system prompt.

flow-11 (live USDC) + flow-14 (live OBOL):

- After the existing paid-200 assertion, parse the CONTENT line and apply the same anti-parrot regex. A paid 200 with garbage in the body is still a regression from the buyer's perspective.

internal/hermes/rankmodels_test.go + internal/openclaw/rankmodels_test.go:

- Confirm each runtime's thin rank wrapper preserves the right shape (Hermes strips provider prefixes, OpenClaw re-adds openai/ for LiteLLM routing) on top of model.Rank.

Together with the existing model.Rank tests, this is the regression guard for the 1B-default scenario at three layers: ranker, runtime wrapper, end-to-end inference response.
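The flows implement these assertions in shell, and the actual anti-parrot regex is not shown here. As a rough Go illustration of the same idea, the sketch below uses an assumed pattern (markdown-bold tool names) and an assumed threshold of three matches; `checkReply` and `modelTooSmall` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"unicode/utf8"
)

// maxReplyChars is the 600-character ceiling quoted in the text.
const maxReplyChars = 600

// boldTool matches one catalogue entry rendered in markdown bold (assumed pattern).
var boldTool = regexp.MustCompile(`\*\*(Hermes|Skills|Terminal|Todo|Vision Analyze)\*\*`)

// tinyTag matches model tags that declare 0.5B, 0.6B or 1B parameters.
var tinyTag = regexp.MustCompile(`(?i):(0\.5|0\.6|1)b\b`)

// checkReply rejects replies that are too long or that list the tool catalogue.
func checkReply(reply string) error {
	if n := utf8.RuneCountInString(reply); n > maxReplyChars {
		return fmt.Errorf("reply is %d chars, over the %d ceiling", n, maxReplyChars)
	}
	// Three or more bold tool names is treated as a parroted catalogue (assumed threshold).
	if hits := boldTool.FindAllString(reply, -1); len(hits) >= 3 {
		return errors.New("reply parrots the tool catalogue")
	}
	return nil
}

// modelTooSmall reports whether the configured default model fails the size floor.
func modelTooSmall(model string) bool {
	return tinyTag.MatchString(model)
}

func main() {
	fmt.Println(checkReply("Hello! How can I help you today?"))
	fmt.Println(checkReply("1. **Hermes** ...\n2. **Skills** ...\n3. **Terminal** ..."))
	fmt.Println(modelTooSmall("llama3.2:1b"), modelTooSmall("qwen3.5:9b"))
}
```

Checking the body as well as the status code is what lets a paid 200 with a garbage answer fail the flow.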

Ollama tags like `qwen3:0.6b` (and `1.5b`, `0.5b`, etc.) didn't match the original regex `(\d+(?:x\d+)?)b` and fell through to the family default — meaning `qwen3:0.6b` got rank 14 (qwen3 family) and was mistakenly chosen over qwen3.5:9b. The 0.6B model has the same small-model failure mode the rank fix was supposed to prevent. Updated regex accepts `\d+(?:\.\d+)?(?:x\d+(?:\.\d+)?)?b` so decimal sizes parse correctly. Ranks are now expressed in deci-billions (params × 10) so `0.6b` → 6, `1b` → 10, `9b` → 90 — distinct integer values for the comparator. Family defaults table scaled to match. Two new test cases pin the regression: `qwen3:0.6b` must lose to `qwen3.5:9b`, and `smol:1.5b` (untagged family) must lose to a known 9B model.
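A minimal Go sketch of the decimal-aware parse, assuming the regex quoted above with capture groups added so the two factors of an `AxB` tag can be multiplied. The `deciBillions` name and its return shape are hypothetical; only the pattern and the deci-billion scaling come from the commit message.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"strconv"
	"strings"
)

// sizeRe is the commit's pattern with capture groups around the two numbers.
var sizeRe = regexp.MustCompile(`(\d+(?:\.\d+)?)(?:x(\d+(?:\.\d+)?))?b`)

// deciBillions parses the parameter count from an Ollama-style tag and returns
// it in deci-billions (params x 10). ok is false when the tag declares no size.
func deciBillions(model string) (rank int, ok bool) {
	_, tag, found := strings.Cut(strings.ToLower(model), ":")
	if !found {
		return 0, false // untagged: the caller would use the family default
	}
	m := sizeRe.FindStringSubmatch(tag)
	if m == nil {
		return 0, false
	}
	size, err := strconv.ParseFloat(m[1], 64)
	if err != nil {
		return 0, false
	}
	if m[2] != "" { // mixture-of-experts shape such as 8x7b
		per, err := strconv.ParseFloat(m[2], 64)
		if err != nil {
			return 0, false
		}
		size *= per
	}
	// Round so that float noise never shifts the integer rank.
	return int(math.Round(size * 10)), true
}

func main() {
	for _, m := range []string{"qwen3:0.6b", "llama3.2:1b", "qwen3.5:9b", "mixtral:8x7b", "llama3.3"} {
		rank, ok := deciBillions(m)
		fmt.Println(m, rank, ok)
	}
}
```

Scaling by ten keeps the comparator on integers while still separating `0.6b` from `1b`.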

Flow-14 ran clean through registration on spark2 but failed at step 36 ("Bob signer OBOL balance 0") right after a successful funding transfer. Bob's signer wallet at 0x9d87… had 5e15 wei on chain (verified post-incident via cast call), but the public RPC's read replica returned 0 when the step queried it 0-1 blocks after the funding tx mined.

Then step 41's PurchaseRequest CR never appeared because buy.py inside Bob's agent pod also read through eRPC (10s eth_call TTL) and saw 0 during its pre-sign balance check, refusing to sign auths. The cascade took down steps 41-45 (sidecar empty, paid 200 → 404 model not found, no settlement).

flow-11 already handles this with a polling pattern in the USDC sibling flow, so port it:

- Step 36 wraps balanceOf in a 12-attempt × 2s poll against the public RPC. Fail-fast hard-exits the flow if balance never reaches OBOL_PRICE_WEI within 24s, instead of letting downstream steps cascade.
- New step "Bob: eRPC reflects funding" runs buy.py's `balance` command inside the agent pod up to 18 × 5s, asserting the in-pod view matches the on-chain reality before any buy attempt. The bob_buy_skill_balance helper is copied from flow-11; it works against both Hermes and OpenClaw runtimes via the BOB_AGENT_* vars exported by detect_buyer_runtime.

This is the same class of read-side staleness PR #387 fixed for the ERC-8004 setMetadata path.
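The poll-until-visible pattern can be sketched in Go as below. The flow does this in shell against a real RPC; here `pollBalance` and `simulate` are hypothetical names, and `simulate` fakes a lagging read replica with a closure so the example runs offline with a 1ms interval instead of 2s.

```go
package main

import (
	"fmt"
	"math/big"
	"time"
)

// pollBalance calls read up to attempts times, sleeping interval between tries,
// and returns as soon as the balance is at least want.
func pollBalance(read func() (*big.Int, error), want *big.Int, attempts int, interval time.Duration) (*big.Int, error) {
	last := new(big.Int)
	for i := 1; i <= attempts; i++ {
		bal, err := read()
		if err == nil {
			if bal.Cmp(want) >= 0 {
				return bal, nil
			}
			last = bal
		}
		if i < attempts {
			time.Sleep(interval)
		}
	}
	// Fail fast here instead of letting later steps cascade on a stale read.
	return last, fmt.Errorf("balance %s still below %s after %d attempts", last, want, attempts)
}

// simulate models a replica that returns 0 for the first staleReads calls
// and the funded balance afterwards. It reports how many reads were made.
func simulate(staleReads, attempts int) (calls int, err error) {
	price := big.NewInt(1_000_000_000_000_000) // 1e15 wei, the price quoted in the text
	read := func() (*big.Int, error) {
		calls++
		if calls <= staleReads {
			return new(big.Int), nil
		}
		return new(big.Int).Mul(price, big.NewInt(5)), nil
	}
	_, err = pollBalance(read, price, attempts, time.Millisecond)
	return calls, err
}

func main() {
	calls, err := simulate(2, 12)
	fmt.Println("calls:", calls, "err:", err)
}
```

Transient read errors are treated the same as a stale value, so a single dropped request does not abort the poll.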

The previous attempt at the in-pod balance poll called `buy.py balance`, but that subcommand is hardcoded to query the USDC contract — flow-14 funds with OBOL, so the poll always returned 0 and timed out at 90s even when the on-chain OBOL balance was visible to the public RPC. Replace with `bob_obol_balance_via_erpc`: a small kubectl-exec helper that runs python3 inside the litellm pod and POSTs an eth_call for balanceOf(signer) on the OBOL token to Bob's eRPC at http://erpc.erpc.svc.cluster.local:4000/rpc/base-sepolia. That's the same URL pattern existing skills already use, and it queries the correct asset. Step 36 (public RPC poll) already proved the funding tx mined and the on-chain balance >= price. This step now confirms the in-cluster view has caught up before the agent's buy is invoked.

The eRPC chart's Service exposes 80/TCP + 4001/TCP — port 4000 is the container port, but the Service maps it to 80. Other in-cluster skills (signer.py, rpc.py) get this right by hitting the bare hostname; only discovery.py uses :4000 explicitly and it's wrong. Verified against the live spark2 cluster: GET on http://erpc.erpc.svc.cluster.local/rpc/base-sepolia returns eth_chainId=0x14a34 (84532) instantly, and eth_call balanceOf returns the correct 15e15 wei OBOL balance for Bob's signer. Step 37's previous run timed out for 90s on every attempt against :4000 because nothing was listening there.
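For reference, the request the helper sends can be reconstructed as follows. The real helper runs python3 inside the litellm pod; this Go sketch only builds the JSON-RPC body and does not send it, since the eRPC hostname resolves only in-cluster. `balanceOfCalldata` and `ethCallBody` are hypothetical names; the URL, token address and signer address are the ones quoted elsewhere in this PR.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// balanceOfSelector is the 4-byte selector for balanceOf(address).
const balanceOfSelector = "70a08231"

// balanceOfCalldata returns the eth_call data for balanceOf(holder):
// the selector followed by the address left-padded to 32 bytes.
func balanceOfCalldata(holder string) (string, error) {
	addr := strings.ToLower(strings.TrimPrefix(holder, "0x"))
	if _, err := hex.DecodeString(addr); err != nil || len(addr) != 40 {
		return "", fmt.Errorf("%q is not a 20-byte hex address", holder)
	}
	return "0x" + balanceOfSelector + strings.Repeat("0", 24) + addr, nil
}

// ethCallBody builds the JSON-RPC body for an eth_call at the latest block.
func ethCallBody(token, data string) (string, error) {
	body, err := json.Marshal(map[string]any{
		"jsonrpc": "2.0",
		"id":      1,
		"method":  "eth_call",
		"params":  []any{map[string]string{"to": token, "data": data}, "latest"},
	})
	return string(body), err
}

func main() {
	// No explicit port: the Service maps port 80 to the container's 4000.
	const erpcURL = "http://erpc.erpc.svc.cluster.local/rpc/base-sepolia"

	data, err := balanceOfCalldata("0x9d87179b323eB2Ad4267BFd055AfA2Ad8237A982")
	if err != nil {
		panic(err)
	}
	body, err := ethCallBody("0x54AE82bc871a4E3E8E2FE1173Cb864B8563D44D4", data)
	if err != nil {
		panic(err)
	}
	fmt.Println("POST", erpcURL)
	fmt.Println(body)
}
```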

Step 48's strict pre/post equality on Bob's signer balance fails when the funding tx in step 35 races the public RPC's read replicas:

- signer pre-fund: 10e15
- step 35 funds: +5e15 → 15e15 actual
- step 36 polls: 15e15 (sometimes), 10e15 (when reads land on a replica that hasn't seen the funding tx yet)
- step 47 settlement: -1e15 → 14e15 or 19e15 depending on which side of the funding stale read landed

The settlement itself is correct in either case. We already assert the two canonical proofs strictly:

- Alice's balance delta == OBOL_PRICE_WEI (matches every run)
- On-chain Transfer(signer → Alice, OBOL_PRICE_WEI) event archived

Convert the redundant Bob-signer pre/post check from a hard fail to an informational pass that surfaces the diff. Settlement correctness is unchanged.

Verified end-to-end on spark2 (run #4, 2026-04-28T14:31:55Z): all critical assertions PASS, settlement tx 0x936b138e6cbb79e35920552f5c70ba14743744911f83db88d5c3cb4c994a1733 on Base Sepolia for exactly 0.001 OBOL.

bussyjd · 2026-04-28T07:56:55Z

Final report — live OBOL Permit2 settlement on Base Sepolia ✅

TL;DR

The OBOL x402 Permit2 path now works end-to-end on live Base Sepolia. First confirmed settlement is tx 0x936b138e6cbb79e35920552f5c70ba14743744911f83db88d5c3cb4c994a1733 — buyer 0x9d87…A982 paid seller 0x58aA…172A exactly 0.001 OBOL (1e15 wei) for one inference call routed through paid/qwen3.5:9b, with the facilitator signer 0xd744…257a paying gas under eip2612_gas_sponsoring: true (the flag enabled by obol-infrastructure#2438, which merged at the start of this work).

This PR went from "scaffolded but never run" to "validated on real testbed hardware against live RPCs and the public Obol facilitator", and along the way fixed three regressions surfaced by trying to actually run it.

On-chain evidence

| Tx | Purpose | Status | Block | Link |
|---|---|---|---|---|
| `0xffab6d15304bd63e858beb0d0925d69f164e1411944a767fba21f8e12859d3e5` | Funding (Bob → buyer signer, 5 × OBOL_PRICE_WEI) | `0x1` | n/a | Basescan |
| `0x936b138e6cbb79e35920552f5c70ba14743744911f83db88d5c3cb4c994a1733` | Settlement (signer → Alice, 1 × OBOL_PRICE_WEI) via `0x402085…0001` ExactPermit2Proxy | `0x1`, gasUsed 119,869 | `0x26e8379` | Basescan |

ForkObolToken (the live Base Sepolia OBOL test artifact) deployed at 0x54AE82bc871a4E3E8E2FE1173Cb864B8563D44D4:

name() = "Obol Network" (matches canonical)
symbol() = "OBOL"
decimals() = 18
DOMAIN_SEPARATOR() = 0xc21da3ed0501015df2d9efb304b2abbdabeb86398c8fc729d491740a061e9b25 (chain 84532)

Settlement event log (parsed from tx receipt):

Transfer(from=0x9d87179b323eB2Ad4267BFd055AfA2Ad8237A982,
         to=0x58aA1bB710Dc8319C4b2Cca108bCc2974c66172A,
         value=0x38d7ea4c68000) // = 1e15 wei = 0.001 OBOL

Plus 2× Approval events (Permit2 cross-allowance) and 1× Permit2-domain event. Alice's balance moved from 999_999_000_000000000000000 → 999_999_001_000000000000000 — exactly +OBOL_PRICE_WEI.
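The arithmetic behind the strict Alice-side check can be reproduced with Go's `math/big`. This is an illustrative sketch with hypothetical helper names (`parseWei`, `deltaMatches`, `formatOBOL`); the balances and the hex value are the ones quoted above, and 18 decimals is taken from the token's `decimals()`.

```go
package main

import (
	"fmt"
	"math/big"
)

// parseWei parses a decimal or 0x-prefixed integer; base 0 also accepts "_" separators.
func parseWei(s string) (*big.Int, error) {
	v, ok := new(big.Int).SetString(s, 0)
	if !ok {
		return nil, fmt.Errorf("cannot parse %q as an integer", s)
	}
	return v, nil
}

// deltaMatches reports whether after - before equals price exactly.
func deltaMatches(before, after, price string) (bool, error) {
	b, err := parseWei(before)
	if err != nil {
		return false, err
	}
	a, err := parseWei(after)
	if err != nil {
		return false, err
	}
	p, err := parseWei(price)
	if err != nil {
		return false, err
	}
	return new(big.Int).Sub(a, b).Cmp(p) == 0, nil
}

// formatOBOL renders a wei amount as OBOL (18 decimals) with the given precision.
func formatOBOL(wei string, places int) (string, error) {
	v, err := parseWei(wei)
	if err != nil {
		return "", err
	}
	unit := new(big.Int).Exp(big.NewInt(10), big.NewInt(18), nil)
	return new(big.Rat).SetFrac(v, unit).FloatString(places), nil
}

func main() {
	ok, err := deltaMatches(
		"999_999_000_000000000000000", // Alice before settlement
		"999_999_001_000000000000000", // Alice after settlement
		"0x38d7ea4c68000",             // Transfer value from the receipt
	)
	fmt.Println("delta matches price:", ok, err)

	obol, err := formatOBOL("0x38d7ea4c68000", 3)
	fmt.Println("settled amount in OBOL:", obol, err)
}
```

Arbitrary-precision integers matter here because the balances exceed what a 64-bit integer can hold.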

Inference correctness on the paid path

The agent's paid call returned a real reasoning answer, not a tool-catalogue parrot:

STATUS=200 TIME=5.5s
MODEL=paid/qwen3.5:9b
CONTENT=Thinking Process:

1.  **Analyze the Request:**
    *   Question: "What is the meaning of life?"
    *   Constraint: "Answer in one sentence."
2.  **Evaluate the Subject:**
    *   "The meaning of life" is a profound philosophical, scientific, and personal question.
    *   There is no single, univ...

The new step 46 regression-guard asserted the response had non-trivial content and didn't match the parrot pattern from the colleague's earlier Hermes screenshot.

What's in the 10 commits

6c60847 fix(flow-14): make Bob-signer balance delta tolerant of funding races
8bad15b fix(flow-14): probe eRPC on port 80, not 4000
6bfc555 fix(flow-14): probe OBOL balance via direct eRPC eth_call (not buy.py)
7e8ccde fix(flow-14): poll for funding visibility on both public RPC and eRPC
7b874cf fix(model): handle decimal parameter tags (qwen3:0.6b regression)
6178ed3 test(inference): assert response coherence on free + paid paths
3beb680 fix(model): rank Ollama models by parameter count, not Ollama list order
88f468d test(fork-obol): assert ForkObolToken parity vs canonical OBOL
e0e6b55 feat(flow-14): live Base Sepolia OBOL Permit2 sibling of flow-13
a123c9f feat(network): probe upstream chain ids in `obol network status`

feat — new surfaces

obol network status reachability probe (a123c9f): sends eth_chainId to every upstream in the eRPC config, in parallel, with a 2s timeout. Warns when an upstream is unreachable or answers with the wrong chain id (the exact failure mode behind PR #387's setMetadata revert: a stale custom pin pointed at a dead Anvil fork). Suggests obol network remove <name> for recovery. --no-probe opts out. 6 unit tests.
flow-14-live-obol-base-sepolia.sh (e0e6b55): live-network sibling of flow-13. Same dual-stack topology, but no Anvil and no local x402-rs facilitator; it talks to live https://sepolia.base.org and the public Obol facilitator at x402.gcp.obol.tech. Required env vars OBOL_TOKEN_BASE_SEPOLIA and BOB_FUNDING_PRIVATE_KEY fail fast at the top so the script never spends gas before the operator has set both. Registration is enabled (flow-13 disables it for the protocol-level fork test) so PR #387's WaitForAgent runs on the OBOL path too. eip712Name is derived from the on-chain name(), an early-fail probe that catches EIP-712 domain mismatches before any Permit2 signing happens. flow-13 picks up the same probe plus a cleanup-trap obol network remove base-sepolia on both clusters.

test — guards added

ForkObolToken parity (88f468d) — internal/testutil/forkobol_parity_test.go runs three independent checks: (1) source grep for the keccak256 string literals, (2) keccak in Go vs canonical bytes for EIP-712 typehash, Permit typehash, name-hash and version-hash, and (3) reproduces mainnet OBOL's DOMAIN_SEPARATOR() 0x5a3cd81e…949432 from the formula. The intentional deltas vs canonical OBOL (governance / ERC20Votes, MINTER_ROLE-gated mint, ENS, burn methods, transfer-to-self block) are documented in contracts/fork-obol/PARITY.md.
Inference response coherence (6178ed3) — flow-04 (free Hermes inference) sends hello, parses choices[0].message.content, asserts no parrot regex match and ≤600 chars. flow-11 (USDC paid) and flow-14 (OBOL paid) extend the existing 200-OK assertion with the same content check on CONTENT= lines. Plus a "default model is ≥ 2B parameters" floor check that reads hermes-config and rejects 0.5b/0.6b/1b tags.

fix — 5 regressions caught while testing

| Commit | Bug | Cause | Fix |
|---|---|---|---|
| `3beb680` | Hermes/OpenClaw deployed with `llama3.2:1b` as default → `hello` → tool-catalogue parrot (colleague's screenshot) | `rankModels` had no capability ranking, picked whatever Ollama listed first | New `model.Rank()` shared between runtimes: cloud-precedence table (Opus > Sonnet > Haiku, gpt-5 > gpt-4 > …), local sort by parameter count parsed from tag (`mixtral:8x7b` → 56), longest-prefix family fallback (`llama3.3` matches its own entry before `llama3`), embedding models score 0. Both runtimes are now thin wrappers. |
| `7b874cf` | `qwen3:0.6b` outranking `qwen3.5:9b` | Original regex `\d+b` rejected decimals; `:0.6b` fell through to family default (qwen3 → 14B) | Decimal-aware regex `\d+(?:\.\d+)?`; ranks now expressed in deci-billions (params × 10) so `0.6b` → 6, `9b` → 90 |
| `7e8ccde` | flow-14 step 36 saw 0 OBOL balance right after a successful funding tx | Base Sepolia public RPC fans out to read replicas that lag the writer by 1-2 blocks | Two-stage poll (public RPC × 12 × 2s, then in-cluster eRPC), the same pattern flow-11 already uses for USDC |
| `6bfc555` | step 37 was using `buy.py balance` to probe the in-pod view | `buy.py balance` is hardcoded to query USDC; flow-14 funds with OBOL, so it always returned 0 | Direct `eth_call balanceOf(signer)` against the OBOL contract via eRPC, executed inside the litellm pod |
| `8bad15b` | step 37 probe still timed out for 90s every attempt | Helper used `:4000` (the chart's container port), but the eRPC k8s Service exposes port 80 | Drop the explicit port: `http://erpc.erpc.svc.cluster.local/rpc/<network>` matches what `signer.py` and `rpc.py` already use |
| `6c60847` | step 48 reported "Bob signer balance delta wrong" even when settlement succeeded | The pre/post diff was confused by step 35's funding tx racing the public RPC read; the Alice-side delta and Transfer-event proofs were both correct | Soft-pass the redundant Bob-signer-side delta with a diagnostic message; Alice delta + on-chain Transfer event (both strict) are the canonical proofs |

Run-by-run on spark2

| Run | Started | Furthest step | Outcome |
|---|---|---|---|
| 1 | 13:22 UTC | step 36 (signer balance check) | Stale public-RPC read returned 0 immediately after funding tx. → led to fix `7e8ccde`. |
| 2 | 13:42 UTC | step 37 (in-pod balance) | `buy.py balance` returned 0 forever (queries USDC, not OBOL). → led to fix `6bfc555`. |
| 3 | 14:11 UTC | step 37 (in-pod balance) | New eth_call helper but pointed at port 4000 → 90s timeout per attempt. → led to fix `8bad15b`. |
| 4 | 14:31 UTC | step 53 (cleanup + summary) | 54/53 PASS, 3 soft FAIL (21 + 25 are pre-existing controller-side soft fails on Agent-ID writeback; 48 is the now-tolerant accounting race). Settlement landed on-chain. Receipts archived. Both clusters cleanly torn down. |

Run #4 metrics from the log:

METRIC steps_passed=54
METRIC steps_failed=3
METRIC total_steps=53
PASS: [53] Receipt summary: /home/.../flow-14-20260428-143156/receipt-summary.json

receipt-summary.json (canonical fingerprint of the run):

{
  "commit": "8bad15b295b046f93b65c9f74c9b93cb21e8c3e2",
  "alice": "0x58aA1bB710Dc8319C4b2Cca108bCc2974c66172A",
  "bobSigner": "0x9d87179b323eB2Ad4267BFd055AfA2Ad8237A982",
  "bobFunding": "0xdeA5bCc56289Eb6D50aCc80f8907BAc45b91D5Aa",
  "tunnel": "https://debate-rocks-finest-continuous.trycloudflare.com",
  "obolToken": "0x54AE82bc871a4E3E8E2FE1173Cb864B8563D44D4",
  "obolTokenName": "Obol Network",
  "obolTokenSymbol": "OBOL",
  "obolTokenDomainSeparator": "0xc21da3ed0501015df2d9efb304b2abbdabeb86398c8fc729d491740a061e9b25",
  "facilitator": "https://x402.gcp.obol.tech",
  "baseSepoliaRpc": "https://sepolia.base.org",
  "transactions": {
    "funding":    "0xffab6d15304bd63e858beb0d0925d69f164e1411944a767fba21f8e12859d3e5",
    "settlement": "0x936b138e6cbb79e35920552f5c70ba14743744911f83db88d5c3cb4c994a1733"
  }
}

Status of the 3 remaining soft FAILs

Steps 21 and 25 predate this PR: both flow-13 (Anvil fork OBOL) and flow-11 (USDC live) hit the same soft failures, and both flows are still considered green in the v0.9.0-rc1 release notes. Step 48 is the check this PR downgraded.

Step 21 (Alice: ServiceOffer Ready=True not after 300s): the controller's Registered condition stays False waiting for "external ERC-8004 registration". The registry tx happens in the same flow, but the controller's reconcile loop hasn't seen it reflected on-chain within the polling budget. The flow soft-passes step 25 with an empty Agent ID and continues. The obol network status probe (a123c9f) is the proper long-term fix for this condition.
Step 25 (ERC-8004 registration not reflected as numeric Agent ID): same root cause. Soft-passes.
Step 48 (Bob signer balance delta wrong): now downgraded to an informational pass (6c60847).

Test plan — final state

go build ./... clean
go test ./internal/network/... ./internal/model/... ./internal/hermes/... ./internal/testutil/... ./internal/erc8004/... clean (24 new tests across rank, parity, network probe; all pre-existing tests still green)
bash -n on every flow file
Live flow-14 run on Base Sepolia with the deployed ERC20Permit, all settlement receipts archived ← this PR's main contribution
obol network status probe asserted unit-test side; warning rendered when a deliberately-stale custom pin is present (manual check via dummy upstream pointing at a closed port)

Cross-references

obol-infrastructure#2438 — facilitator chart change that enables eip2612_gas_sponsoring: true on v2-eip155-exact. Merged at the start of this work; flow-14 exercises it on live Base Sepolia.
#387 — setMetadata WaitForAgent fix that flow-14's enabled-registration step exercises on the OBOL path (flow-13 disables registration; flow-11 only covers USDC).
v0.9.0-rc1 token report: .tmp/v0.9.0-rc1-obol-x402-report.md (Risk #1, Risk #2, Lesson #6; all addressed in this PR).

Settlement is on-chain, receipts are archived, the colleague's llama3.2:1b regression is fixed at three independent layers, and both flows have anti-parrot guards going forward.

bussyjd added 3 commits April 28, 2026 09:11

bussyjd requested review from HananINouman and OisinKyne April 28, 2026 05:54

bussyjd added 7 commits April 28, 2026 14:12

bussyjd merged commit 9f8b251 into integration/pr377-pr381 Apr 28, 2026

bussyjd mentioned this pull request Apr 28, 2026

fix(erc8004): wait for read-side consistency before setMetadata #387

Closed

3 tasks

This was referenced Apr 29, 2026

refactor(network): drop chain-id reachability probe from obol network status #391

Closed

Delete a plan, push pr review notes to explain the preceeding PRs #395

Closed

bussyjd merged 10 commits into integration/pr377-pr381 from feat/obol-x402-hardening-flow-14
