integration: validate tunnel onboarding with live OBOL faucet flow (#452)
- prefer current Go path for nohup/cron flow execution
- poll for remote-signer pod creation before age checks
- allow cooldown-safe flow-15 reruns when Bob already has faucet OBOL
- poll post-claim balances to tolerate public RPC state lag
…in' into integration/pr450-pr451-cloudflare-obol
Exact-head validation is now aligned with the pushed commit. Local validation summary: the PR body now includes the exact-head smoke matrix and the live/fork OBOL receipt hashes from the exact-head run. One remaining cleanup follow-up I observed during that run: the smoke harness still left the obol-flow10-x402-facilitator helper container behind.
Final status: exact-head validation is green.
Remaining noted follow-up: the smoke harness still left the obol-flow10-x402-facilitator helper container behind.
Co-authored-by: bussyjd <bussyjd@users.noreply.github.com>
* feat(buy): add `obol buy inference` host CLI

  Mirrors `obol sell inference` on the buyer side. The host CLI handles default-seller resolution, ERC-8004 identity pre-flight, and USDC->micro-units conversion, then dispatches to the existing `buy.py buy` skill in the obol-agent pod. Single canonical wallet, no host-side keystore.

  - internal/x402/setup.go: DefaultBuySellerURL, DefaultBuySellerAgentID, DefaultBuySellerChain placeholders (TODO: wire live values once the default seller is provisioned).
  - internal/agentruntime/exec.go: ExecInPod + BuildExecArgs generalize the kubectl-exec helper that was hardcoded to the hermes binary.
  - internal/hermes/hermes.go: cliViaKubectlExec + hermesExecArgs delegate to the new agentruntime helpers; the existing test stays valid.
  - internal/buy/discover.go: .well-known/agent-registration.json fetcher and ERC-8004 agentId verification (hard-fail on mismatch).
  - cmd/obol/buy.go: `obol buy inference [<name>] --seller --model --budget --expected-agent-id --no-verify-identity --auto-refill ...`.

* test(flow-11): validate host buy inference on integration
Update: host-buy validation on the integration branch used the exact pre-merge integrated head.

Targeted checks:
Live flow evidence: ran
Key assertions that passed:
Exact-head receipts:
Flow result:
Artifacts from that exact-head run:
* Agent crd
* Next phase
* 1, 2a, 2b, 2c, 4a, 4b, 5, 6, 7, 8, 9
* 2d
* Update with almost all complete, time for testing
* Bug fixing
* chore: remove stray runtime log
* chore(flows): renumber sell-agent smoke flow for integration
* fix(agent): harden CRD update sync semantics

---------

Co-authored-by: bussyjd <bussyjd@users.noreply.github.com>
Co-authored-by: bussyjd <jd@obol.tech>
Both versions were intended to land via the integration branch behind PR #452 but did not make it through the squash merges. Aligning main with the latest published tags.

- frontend: v0.1.21-rc1 → v0.1.23 (real release, off the rc)
- hermes-agent: v2026.4.30 → v2026.5.7
- justfile dev-frontend-reset target: v0.1.19 → v0.1.23
Pulls forward five small correctness fixes that were carried on the integration branch behind #452 but did not survive the squash merges.

- Re-queue offers when their referenced Agent changes. Without this, an Agent status edit (e.g. status.pinnedModel after the user edits spec.model) never propagates into the offer's status.agentResolution, because the offer reconciler only runs when the offer itself changes.
- Refuse to Update Namespace and PersistentVolumeClaim during applyAgentObject. PVCs reject wholesale Update with "spec is immutable after creation", and the controller's RBAC only grants `create` on Namespaces. Treat existence as success for these kinds and move on; mutable kinds (ConfigMap, Secret, Deployment, Service, ServiceAccount) keep going through the normal Update path.
- Fall back to status.agentResolution.Model in the storefront catalog when an offer's spec.model is empty (the canonical state for type=agent offers, where the model lives on the linked Agent).
- Bump the serviceoffer-controller Deployment memory request from 64Mi to 128Mi and the limit from 256Mi to 512Mi. The Agent informer + agent reconciler + in-controller keystore generation pushed steady-state past 256Mi after #453 and triggered OOMKilled restart loops.
- Set GATEWAY_ALLOW_ALL_USERS=true on CRD-rendered agent pods. CRD agents only expose the API (gated by API_SERVER_KEY + ForwardAuth); no Telegram/Discord/dashboard platforms are wired. The flag silences Hermes' user-gateway startup warning without opening any real surface.
…odel pin

Pulls forward three dev-experience improvements from the integration branch behind #452 that did not survive the squash merges.

- Selective image rebuild via OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES. The variable now accepts a comma-separated list of image short names (e.g. `x402-verifier,serviceoffer-controller`) in addition to the existing `true`/`all` and `false`/`0`/unset behaviours. The full image set is x402-verifier, serviceoffer-controller, x402-buyer, demo-server, and obol-stack-public-storefront (with `public-storefront` accepted as an alias). Saves a full ~10-minute rebuild when only one image changed.
- Claude Code plugin install tip on stack up. After `obol stack up`, if the `claude` CLI is present but the ObolNetwork/skills marketplace or its plugin isn't installed, surface a one-line install hint. Reads ~/.claude/plugins/{known_marketplaces,installed_plugins}.json best-effort; silently no-ops on any error, so a malformed Claude config can never block stack up.
- Auto-pin a model on the agent-backed demo. `obol sell agent --demo` resolves the first non-`paid/*` model from the cluster's LiteLLM config (the same source `obol model list` reads) and writes it into the rendered Agent's spec.model so the controller doesn't park at ModelUnpinned. Returns a clear "configure a model first" error if the cluster has nothing usable, and removes a stale "depend on step 2d" caveat that no longer applies.

Docs updated in CLAUDE.md, .agents/skills/obol-stack-dev/SKILL.md, and .agents/skills/obol-stack-dev/references/dev-environment.md.
…image
Verified locally against ghcr.io/obolnetwork/remote-signer:v0.3.0:
- Main's KEYSTORE_PASSWORD env name is unrecognised; the binary exits
with Error: NoPassword on startup.
- Main's keystore dir /keystores conflicts with the image's default
/data/keystores (declared as a volume in the image config).
- Main's /health readiness probe returns HTTP 404; the binary only
serves /healthz, which returns {"status":"ok"}.
Together these mean any Agent CR with wallet.create=true on main has a
remote-signer that crash-loops or fails liveness, blocking the agent
from ever reaching Ready.
This is what the integration branch behind #452 was carrying. Pulling
it forward:
- Move keystore dir to /data/keystores (the image default), and pin
the on-disk filename to keystore.json so the Secret volume
projection no longer needs to thread the V3 UUID through; the V3
document carries the address internally so the cosmetic filename
doesn't matter.
- Add ensureCanonicalKeystoreKey migration helper: on reconcile of an
existing Secret with the wallet annotation, if data is keyed under
the old UUID-named JSON field, rewrite it as keystore.json
in-place. Refuses ambiguous Secrets with multiple legacy JSON keys.
- Switch env scheme to upstream's SIGNER__SECTION__KEY hierarchy
(SIGNER__SERVER__HOST, SIGNER__SERVER__PORT, SIGNER__KEYSTORE__DIR,
SIGNER__KEYSTORE__PASSWORD, SIGNER__LOGGING__FORMAT/LEVEL). Matches
the master agent's working config in hermes-obol-agent.
- Switch readiness and liveness probes from /health to /healthz.
Adds 8 unit tests covering fresh keystore creation, reuse, legacy key
migration, ambiguity rejection, malformed data, and the canonical
Secret/Deployment shape (single keystore.json projected, password
read via env, never mounted).
`resolveAssetTermsFor` returned `--token X is not available on chain Y (supported tokens: OBOL, USDC)` when a token wasn't registered for the requested chain. The "supported tokens" list came from the global registry (`SupportedTokens()`), not from the chain, so operators reading the error saw OBOL listed as supported even though the lookup just failed on `base-sepolia`/`base`/etc. This was actively misleading.

Surfaced today on spark2 while wiring `obol sell inference … --token OBOL --chain base-sepolia`: the binary (v0.9.0) rejected OBOL on base-sepolia (registry entry added in #452 after the release was cut), but the message claimed OBOL was supported.

Changes:
- Add `TokensOnChain(chain)` and `ChainsForToken(token)` helpers in internal/x402/tokens.go so callers can ask the registry chain-scoped questions without iterating it themselves.
- Rewrite the error in `resolveAssetTermsFor` to use both: `--token OBOL is not available on chain base-sepolia; tokens on base-sepolia: OBOL, USDC; OBOL is registered on: base-sepolia, ethereum`, with four branches covering the chain-empty, token-empty, both-empty, and normal cases.
- Add table-driven tests covering the helpers (chains/tokens lookups, aliases, unknown chain/token, case-insensitive token names).

Co-authored-by: bussyjd <bussyjd@users.noreply.github.com>
Summary
What changed:
- …(stack down / stack purge) and preserves cached stack IDs so k3d fallback cleanup still works when purge removes config
- …when ruby is unavailable
- …gemma4-fast path green across the integrated smoke, live Base Sepolia OBOL, and forked-OBOL flows

Why it matters:
13ba63c17b701fafe42606501125e309768da9bb is now green for the full smoke suite, including flow-11, flow-14, and flow-13.

Risk level: medium
Commit under test: 13ba63c17b701fafe42606501125e309768da9bb
Base branch: main

Scope
Validation
CI checks:
lint-test, Analyze (actions), Analyze (go), Analyze (javascript-typescript), Analyze (python)

Pre-commit / local correctness checks:
Exact-head release smoke:
Inline smoke report summary:
flow-01-prerequisites, flow-02-stack-init-up, flow-03-inference, flow-04-agent, flow-05-network, flow-06-sell-setup, flow-07-sell-verify, flow-10-anvil-facilitator, flow-08-buy, flow-09-lifecycle, flow-11-dual-stack, flow-14-live-obol-base-sepolia, flow-13-dual-stack-obol

Artifacts from the exact-head run:
- .tmp/release-smoke-20260509-165243/RELEASE_REPORT.md
- .tmp/release-smoke-20260509-165243/flow-11-receipts/receipt-summary.json
- .tmp/release-smoke-20260509-165243/flow-14-receipts/receipt-summary.json
- .tmp/release-smoke-20260509-165243/flow-13-receipts/receipt-summary.json

Live Chain Evidence
Network: Base Sepolia (84532)
RPC/provider: https://base-sepolia-rpc.publicnode.com
Facilitator: https://x402.gcp.obol.tech (local: http://127.0.0.1:53788)

Contracts and tokens:
- 0x8004A818BFB912233c491871b3d84c89A494BD9e
- 0x0a09371a8b011d5110656ceBCc70603e53FD2c78 (Obol Network / OBOL / 18 decimals)
- 0x210BBd033630e5e611B7922D70b0Caabe64636d9 (flow-13)
- 0x000000000022D473030F116dDEE9F6B43aC78BA3 (Permit2)

Wallet roles:
- 0xC0De030F6C37f490594F93fB99e2756703c4297E
- 0x57b0eF875DeB5A37301F1640E469a2129Da9490E

Exact-head transaction evidence:
- flow-11-dual-stack (exact-head evidence): port 5702, tunnel https://pottery-arms-horses-tall.trycloudflare.com; txs 0x844bb9d8179571aca3f53fd95b5ba33cd4c972538c84a54138cbfdf0ee37604c, 0xc2a9a72bed2d7cd8311839b4c803e4d950a89704997af017bd76cdbdc774f48d, 0x651c44cab864ffed001a3fb089a1198fff7b4e04c1093fe7f4ee86fcf5a6ad71
- flow-14-live-obol-base-sepolia (exact-head evidence): port 5703, tunnel https://statute-allen-leaf-runs.trycloudflare.com; token 0x0a09371a8b011d5110656ceBCc70603e53FD2c78; txs 0xd183bb1ecd2993b87afe72e47e266b5b98f34091dc30d73c061a3d6e30917ee1, 0xa919f4b20b9fcfc0b00efd3b3d0c406bbf44ce7066db0489145f5ecf83d43b4f, 0xa192904a6c415b30cf908de500ff8c8330724b14601cbb9112181a2146deb576; balances 7000000000000000 -> 8000000000000000 wei (+1000000000000000), 4993000000000000000 -> 4992000000000000000 wei (-1000000000000000)
- flow-13-dual-stack-obol (exact-head evidence): tunnel https://catering-solid-night-several.trycloudflare.com; txs 0xef2d85e801191599dec7ed3790bc74dd7b1c1f9c7f4f63c80b36e01254334582, 0x4e476edc29b0576aff44c48ca39889a9bffa38fc626d7087f00e8ff9637cf8b7; balances 10000000000000000000 -> 10001000000000000000 wei (+1000000000000000), 10000000000000000000 -> 9999000000000000000 wei (-1000000000000000)

Runtime Evidence
QA environment:
Python 3.11.14, Go 1.25.5, GitHub CLI 2.86.0, Docker 29.4.2, kubectl client v1.35.3; model gemma4-fast; spark1 -> 192.168.100.11:8000 so the spark2 endpoint stayed available across Cloudflare SSH flap events

Model and paid-route evidence:
- paid/gemma4-fast returned HTTP 200 with coherent content in flow-11, flow-14, and flow-13
- 1000000000000000 wei (0.001 OBOL) from Bob signer to Alice seller
- 1000000000000000 wei (0.001 OBOL) from Bob signer to Alice seller

Post-run cleanup state:
flow-13: stopped by the flow; the flow-10 facilitator helper container had to be removed manually (docker rm -f obol-flow10-x402-facilitator) and is called out below as a remaining cleanup follow-up

Review Notes
Known gaps:
- flow-10 / smoke cleanup still left a helper container (obol-flow10-x402-facilitator) after the exact-head run; I removed it manually after validation. This PR materially improves stale-workspace / cluster cleanup, but there is still one remaining facilitator-container cleanup follow-up.
- shellcheck still reports several pre-existing warnings in the flow harness outside the paths touched here.

Reviewer focus:
- flows/lib.sh: cached stack-id cleanup path
- flows/flow-07-sell-verify.sh: fail-closed public tunnel eRPC verification
- flows/flow-13-dual-stack-obol.sh: explicit runtime prereq handling for the YAML patch helper
- .tmp/release-smoke-20260509-165243/: exact-head run artifacts