Review giant monetise pr#316
Merged
Merged
Conversation
OisinKyne
added a commit
that referenced
this pull request
Apr 7, 2026
…exing (M3) (#288) * feat: monetize-guide skill + sell probe command (#286) Add monetize-guide skill (SKILL.md + seller-prompt reference) that teaches Claude Code the end-to-end flow: pre-flight checks, model detection, pricing research via ERC-8004 registry, user-confirmed pricing, sell execution, reconciliation monitoring, and endpoint verification. Add `obol sell probe <name> -n <ns>` command that hits the public tunnel URL and verifies the endpoint returns 402 with valid x402 pricing headers. Closes the feedback loop so Claude can confirm a service is live. * fix: approximate per-hour pricing to per-request for x402 gating --per-hour was passing the raw hourly price as the per-request charge (e.g., $0.50/hour charged $0.50 per HTTP request). Now approximates using a 5-minute experiment budget: perRequest = perHour * 5/60. Also rewrites worker_api.py to use stdlib http.server (no Flask dep). * feat: add reth-based ERC-8004 indexer (from #269) * fix: address PR 265 review findings — security, correctness, test coverage Critical: - Fix path traversal in worker_api.py: validate experiment_id with regex - Add GGUF format guard in publish.py before ollama_create() - Run worker container as non-root user in Dockerfile Medium: - Replace manual provenance struct-to-map with JSON round-trip in sell.go - Fix weak test assertion in TestApproximateRequestPriceFromPerHour - Guard int(amount) cast in coordinate.py with try/except - Remove domain-specific default "val_bpb" from CRD metricName field - Guard maxTimeoutSeconds parse in monetize.py Low: - Add provenance propagation test for build_registration_doc - Fix doc type inconsistency in coordination-protocol.md - Assert all 6 Provenance fields in store_test.go # Conflicts: # cmd/obol/sell.go # internal/embed/infrastructure/base/templates/serviceoffer-crd.yaml # internal/embed/skills/autoresearch-coordinator/references/coordination-protocol.md # internal/embed/skills/autoresearch-coordinator/scripts/coordinate.py # internal/embed/skills/autoresearch/SKILL.md # internal/embed/skills/autoresearch/scripts/publish.py # internal/embed/skills/sell/scripts/monetize.py # internal/inference/store_test.go # tests/test_sell_registration_metadata.py * fix(m1): add sell probe command and flow validation scripts - Add `obol sell probe <name> -n <ns>` — sends unauthenticated request through Traefik to verify the endpoint returns 402 with x402 pricing. - Create flow-06-sell-setup.sh: idempotent ServiceOffer creation - Create flow-07-sell-verify.sh: wait for reconciliation + verify conditions - Create flow-10-anvil-facilitator.sh: Anvil fork + x402-rs facilitator - Create flow-08-buy.sh: EIP-712 sign + paid request through Traefik All flows pass end-to-end: sell → verify → pay → inference (HTTP 200). * fix(m2): CRD skills/domains fields + coordinator x402 V1 parsing - Add skills and domains array fields to ServiceOffer CRD registration schema. The --register-skills CLI flag was rejected by strict CRD validation because the fields were missing. - Fix coordinator.py parse_402_pricing to handle x402 V1 standard format (accepts[] array) in addition to flat top-level fields. Validated: worker ServiceOffer → 402 gate → .well-known → discovery → coordinator probe. * fix(oasf): align skill/domain taxonomy with official OASF schema Paths validated against https://github.com/agntcy/oasf (cloned at R&D/oasf/). Skills: - natural_language_processing/text_generation/chat_completion → natural_language_processing/natural_language_generation/text_completion - machine_learning/model_optimization → analytical_skills/model_optimization Domains: - technology/artificial_intelligence → technology/data_science - technology/artificial_intelligence/research → research_and_development/scientific_research Fixed in: monetize.py, coordinate.py, monetize-guide SKILL.md, autoresearch SKILL.md, autoresearch-worker SKILL.md, coordinator references, seller-prompt.md * fix(oasf): use real schema paths — devops_mlops/model_versioning analytical_skills/model_optimization was invented (doesn't exist in OASF). Closest real path: devops_mlops/model_versioning (validated against agntcy/oasf schema repo at R&D/oasf/). Also confirms prior fixes: - natural_language_processing/natural_language_generation/text_completion - technology/data_science - research_and_development/scientific_research * fix: stop stripping /v1 from custom endpoint URLs LiteLLM does NOT auto-append /v1 for openai/ provider routes. WarnAndStripV1Suffix was removing /v1 from api_base before storing in the ConfigMap, causing LiteLLM to hit /chat/completions (404) instead of /v1/chat/completions. Verified against LiteLLM source (gpt_transformation.py:get_complete_url): api_base is passed as-is to the OpenAI SDK base_url, which appends /chat/completions directly. The /v1 must be in api_base. Removed from all 3 call sites: - cmd/obol/model.go (obol model setup custom) - internal/openclaw/openclaw.go (promptForDirectProvider) - internal/openclaw/openclaw.go (promptForCustomProvider) * docs: add LAN resource selling path to monetize-guide Adds a new "External LAN Resources" section to the monetize-guide SKILL.md covering the flow for selling GPU servers or inference endpoints on the local network (e.g., DGX Spark running vLLM). The path: obol model setup custom (bridge into LiteLLM) → obol sell http (create ServiceOffer pointing at LiteLLM). Documents that LAN IPs are reachable from k3d without additional config, and that --endpoint must include /v1 since LiteLLM does not auto-append it. Validated end-to-end with Nemotron 3 Super 120B on 2x DGX Spark: - obol model setup custom validates and adds to LiteLLM - obol sell http creates ServiceOffer - Agent heartbeat reconciles in ~90s → all 6 conditions True - 402 gating works locally and through Cloudflare tunnel - .well-known and /skill.md discovery updated automatically * fix: restore tunnel open/close and host-service logic dropped during cherry-pick Cherry-picking PR #265 into feat/monetize-path had conflicts in sell.go that dropped 3 blocks of code from main: 1. obol sell inference — cluster-aware routing: detects k3d cluster, creates K8s Service+Endpoints bridge to host gateway, creates ServiceOffer, auto-starts tunnel via EnsureTunnelForSell() 2. obol sell http — auto-tunnel: calls EnsureTunnelForSell() after creating a ServiceOffer so the endpoint is immediately public 3. obol sell delete — auto-stop tunnel: when the last ServiceOffer is deleted, stops the quick tunnel and removes the storefront Also restores: - NoPaymentGate field on Deployment and GatewayConfig structs - createHostService(), resolveHostIP(), buildInferenceServiceOfferSpec() - net, runtime, strconv, stack imports * security: pin LiteLLM image to v1.82.3 — supply chain compromise LiteLLM PyPI packages 1.82.7 and 1.82.8 contain a malicious .pth file (litellm_init.pth) that exfiltrates environment variables, SSH keys, cloud credentials, and Kubernetes configs to an external endpoint. See: BerriAI/litellm#24512 Our template used the floating tag `main-stable` which could pull a compromised build. Pin to `main-v1.82.3` (confirmed safe, matches the version currently running in our clusters). Never use floating tags for security-sensitive dependencies. * fix: restore /skill.md publishing dropped during cherry-pick The cherry-pick of PR #265 dropped _build_skill_md() and _publish_skill_md() from monetize.py, along with their 3 call sites in cmd_process. This meant /skill.md would never be created or updated on a fresh cluster. Restores: - _build_skill_md(): generates service catalog markdown from Ready offers - _publish_skill_md(): creates/updates ConfigMap + Deployment + Service + HTTPRoute for the /skill.md endpoint - 3 call sites in cmd_process: 1. Empty skill.md when no offers exist 2. Full skill.md when all offers are Ready 3. Regenerate after reconciliation loop * test: add tunnel lifecycle test coverage No tests existed for tunnel state persistence or auto-stop decision logic — this is why the cherry-pick drift went undetected. New tests in tunnel_lifecycle_test.go: - State round-trip (save/load, quick & dns modes) - Missing state file returns (nil, nil) - State overwrite replaces previous - File permissions (0600 for non-secret metadata) - UpdatedAt timestamp refresh on save - tunnelModeAndURL derivation - shouldAutoStopTunnel decision logic (5 cases covering the logic from sell delete: stop quick tunnels when empty, never stop dns) - Exported LoadTunnelState wrapper * chore: add golangci-lint v2 config with gofumpt formatting (#290) * chore: update CLAUDE.md conventions and fix stale justfile commands - Add Conventions section (conventional commits, branch prefixes, skill reference) - Trim discoverable sections (Key Packages table, file-path tables, Embedded assets) - Keep full LLM Routing, Security, Pitfalls sections - Fix justfile: obol cluster → obol stack (stale command name) * chore: add golangci-lint v2 config and PostToolUse format hook Aligned with ObolNetwork/charon lint stance, adapted for obol-stack: - Disabled linters not applicable to CLI apps (wrapcheck, forbidigo, noctx, testpackage) - Enabled gofumpt + goimports formatters - Added PostToolUse hook to auto-lint Go files on Write|Edit Many linters are disabled with TODO comments for incremental re-enablement. * style: apply gofumpt and goimports formatting across codebase Auto-fix from golangci-lint v2 with gofumpt + goimports formatters. Also removes unused code: sellInfoCommand (cmd/obol/sell.go), pubKeyBytes (internal/tee/attest_stub.go), and fixes dogsled/thelper/govet warnings. * security: re-enable gosec linter and fix all issues - Tighten WriteFile permissions from 0644 to 0600 across 13 files (config, keys, kubeconfig — no reason for group/world read on local CLI) - Exclude G204 (subprocess) and G101 (credential naming) — systemic false positives for a CLI tool that intentionally runs k3d/helm/kubectl - Exclude gosec from internal/testutil/ (test-only helpers) - Nolint G704 (SSRF), G705 (XSS), G122 (TOCTOU), G703 (path traversal) with explanations — all false positives for local config operations * chore: re-enable 10 linters and fix all issues Re-enabled: errcheck, errchkjson, gocritic, nilerr, nilnil, nosprintfhostport, staticcheck, unconvert, wastedassign. Fixes: - errcheck: add _ = for best-effort calls (browser open, process kill, kubectl display, io.Copy to discard) - nilerr: annotate intentional fallback patterns (missing config files, optional features, backward compat defaults) - staticcheck: lowercase error strings (ST1005), rename hasId→hasID (ST1003), nolint deprecated elliptic.Marshal (SA1019) - gocritic: fix appendAssign slice mutation bugs (copy before append), nolint CGo dupSubExpr false positives, exitAfterDefer in main - unconvert: remove 5 unnecessary int() wrappers on cmd.Int() - nosprintfhostport: use net.JoinHostPort in testutil - wastedassign: use var declaration instead of empty init Still disabled: goconst (32), unparam (28) — low priority. * chore: re-enable goconst and unparam — all linters now active - Extract provider name constants (ProviderOllama, ProviderAnthropic, ProviderOpenAI) into internal/model and use across model/openclaw - Extract API key env var constants (envAnthropicAPIKey, envOpenAIAPIKey) - Extract OS name constants in dns/resolver, update status messages in update package, version unknownValue constant - Add isDevMode() helper in config to eliminate repeated "true" comparison - Simplify buildLocalManagedSecretYAML: remove always-nil error return and unused hostname parameter - Exclude goconst and unparam from test files (test fixtures, mock handler signatures) - Nolint validateInstallOptions networkName — extensibility param All linters from .golangci.yml are now active with zero issues. * fix: address PR review feedback on lint config and hook - Set fix: false in .golangci.yml — lint runs are now non-mutating by default, suitable for CI. Use --fix flag explicitly when wanted. - Hook: use golangci-lint via PATH instead of hardcoded ~/.local/bin/ - Hook: use `golangci-lint fmt "$f"` to scope formatting to the edited file only, avoiding unrelated rewrites across the package --------- Co-authored-by: bussyjd <bussyjd@users.noreply.github.com> * refactor: move monetize reconciliation into serviceoffer controller * test: validate PR299 seller buyer and discovery flows * fix: publish discovery assets from controller * refactor: enforce singleton registration ownership * fix: clean merged registration schema * fix: exclude .workspace from Docker build context When OBOL_DEVELOPMENT=true, Docker builds from the project root pick up .workspace/data/ directories that contain root-owned PVC mounts from previous clusters, causing "permission denied" errors during context scanning. Exclude .workspace/ and .worktrees/ from the Docker build context via .dockerignore. Fixes #304 * fix: harden facilitator flow and prewarm dev images * feat: buy-side hardening and openclaw Docker fallback - Add balance guard to buy.py — exits on insufficient USDC unless --force, tells the agent which wallet to fund - Add end-to-end "Full Buy Flow" section to buy-inference SKILL.md covering discover → probe → buy → use paid/<model> → monitor/refill - Add "After Discovery" cross-reference from discovery SKILL.md to buy-inference - Add Docker image fallback for openclaw CLI install in obolup.sh for npm-deprived environments (ARM64 DGX Spark validated) Validated full buyer→seller x402 commerce loop on DGX Spark (Base Sepolia): seller registers on ERC-8004, obol-agent discovers via registry, probes 402, pre-signs ERC-3009 auths, paid/qwen3.5:9b routes through x402-buyer sidecar. * fix: use k3d pull-through caches in dev mode * fix: stageDefaultSkills merges new skills into existing deployments Previously, stageDefaultSkills skipped entirely if the skills/ directory already existed. This meant skills added to the binary after initial deployment (e.g. buy-inference, discovery) were never staged on re-sync, requiring a manual `obol openclaw skills sync --from` workaround. Now always runs CopySkills, which writes embedded files without deleting user-added content. Validated on DGX Spark (ARM64): obol openclaw sync correctly re-stages all 21 skills into an existing deployment, and the full x402 commerce loop (probe → buy 5 auths → 5/5 paid inference → exact depletion on 6th request) passes. * fix: harden serviceoffer controller reconciliation * fix: review cleanup — reduce diff noise and fix shadowed builtins - Rename `copy` → `patched` in controller.go to avoid shadowing the built-in copy function (4 occurrences) - Revert cosmetic whitespace changes in kubectl.go and verifier.go that added review noise without functional benefit - Add ApplyOutput to kubectl.go cleanly atop the original formatting - Remove SetRegistration/HandleWellKnown from verifier (now handled by serviceoffer-controller registration resources) - Update CLAUDE.md dev constraints to mention serviceoffer-controller image https://claude.ai/code/session_016ktZi3zLxbnWrjF8cLyTjm * fix: x402 verifier TLS certs + consumer key for Base Sepolia USDC Three fixes for cross-machine paid inference via facilitator.x402.rs: 1. x402-verifier CA certificates (x402.yaml): The verifier container image lacks a system CA bundle, causing TLS failures when connecting to facilitator.x402.rs: x509: certificate signed by unknown authority Added a ca-certificates ConfigMap (populated at deploy time from the host cert store) mounted at /etc/ssl/certs in the verifier pod. 2. Consumer key: Hardhat #9 -> Hardhat #0 (lib.sh, flow-08-buy.sh): Base Sepolia USDC (0x036CbD53) rejects ERC-3009 transferWithAuthorization signatures from Hardhat accounts #1-#9 (FiatTokenV2: invalid signature). Only HH#0 and custom keys pass validation. Root cause unclear but consistently reproducible. Tested: Real on-chain USDC settlement on Base Sepolia confirmed. TX: 0xa95cc751a3cd010acde7814719db0f539a53829849c4aba05ebb624009e20a4f Seller +0.001 USDC, Buyer -0.001 USDC. * sec: derive test keys from mnemonic at runtime + gitleaks pre-commit - Replace all hardcoded Hardhat private keys with runtime derivation from the well-known test mnemonic via cast wallet derive-private-key - flow-08-buy.sh reads CONSUMER_PRIVATE_KEY from env (set by lib.sh) - flow-10 derives FACILITATOR_SIGNER_KEY via hh_key() helper - Add gitleaks pre-commit hook to block accidental secret commits - Add .gitleaks.toml allowlist for the public test mnemonic Zero private key material in source. Keys are derived at runtime from: test test test test test test test test test test test junk * chore: extract reth-erc8004-indexer from monetize PR Remove the standalone Rust reth-erc8004-indexer and its Dockerfile from this PR. The indexer is not wired into any Helm chart, CLI command, or infrastructure template — it's dead weight in a +23K line PR. Tracked separately for integration as its own PR once the wiring into `obol network install ethereum --indexer` is designed. * fix: wait for verifier readiness probe after restart in BDD tests waitForVerifierRestart only checked pod logs for the new facilitator URL, but the pod could log it before its HTTP listener was ready to serve requests. This caused transient 500s in the next scenario when the verifier was still initializing. Add kubectl rollout status after the restart to block until the pod passes its /readyz probe (returns 200 only after pricing config is loaded). The log check remains as a secondary confirmation with a shorter timeout. * Review giant monetise pr (#316) * Remove indexer that's supposedly not used. Will reintroduce as part of obol network install later * Claudes fixes * Rename probe to test, commit transcript * Update cmd/obol/sell.go Signed-off-by: Oisín Kyne <4981644+OisinKyne@users.noreply.github.com> --------- Signed-off-by: Oisín Kyne <4981644+OisinKyne@users.noreply.github.com> Co-authored-by: bussyjd <bussyjd@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Oisín Kyne <4981644+OisinKyne@users.noreply.github.com>
Closed
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I removed the indexer stuff because claude thinks it wasn't used.
I renamed
probetotest.I asked claude to find issues and let it make almost all of them.
I include
docs/pr-review-monetize-path.mdso you can follow my journey on this PR review in a scalable manner (and to highlight anywhere claude was wrong and that we need to unscrew up).I let it remove the ralph files because i think you have the other PR with all the flows and gherkin stuff which has superseeded your ralph loop if i'm not mistaken.
If you're happy, lets get this merged to yours, into main, and also merge your draft PR that improves logging etc.