15 changes: 12 additions & 3 deletions .agents/skills/obol-stack-dev/SKILL.md
@@ -133,12 +133,21 @@ locally-tagged `ghcr.io/obolnetwork/<name>:latest` and your source change
won't reach the running pod):

```bash
# Rebuild everything
OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES=true obol stack up

# Rebuild only the image(s) you changed — much faster
OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES=x402-verifier obol stack up
OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES=serviceoffer-controller,x402-buyer obol stack up
```

Applies to every image in `baseLocalImages` (x402-verifier,
serviceoffer-controller, x402-buyer, demo-server, public-storefront).
The warm-path summary line surfaces this hint when nothing was rebuilt.
Values: `true`/`all` → rebuild every image; comma-separated short names →
rebuild only those; `false`/`0`/unset → reuse all cached images (default).
Short name is the image base without the registry prefix or tag
(e.g. `x402-verifier` from `ghcr.io/obolnetwork/x402-verifier:latest`).
Images: x402-verifier, serviceoffer-controller, x402-buyer, demo-server,
obol-stack-public-storefront (`public-storefront` alias accepted). The
warm-path summary line surfaces this hint when nothing was rebuilt.
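
The short-name rule described above can be sketched in Go (a hypothetical helper for illustration, not the CLI's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// shortName strips the registry prefix and the tag from a full image
// reference, per the documented rule. Illustrative only.
func shortName(ref string) string {
	base := ref
	// Drop everything up to the last '/', e.g. "ghcr.io/obolnetwork/".
	if i := strings.LastIndex(base, "/"); i >= 0 {
		base = base[i+1:]
	}
	// Drop the ":tag" suffix if present.
	if i := strings.LastIndex(base, ":"); i >= 0 {
		base = base[:i]
	}
	return base
}

func main() {
	fmt.Println(shortName("ghcr.io/obolnetwork/x402-verifier:latest")) // x402-verifier
}
```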

Integration checks:

31 changes: 31 additions & 0 deletions .agents/skills/obol-stack-dev/references/dev-environment.md
@@ -74,6 +74,37 @@ obol kubectl cluster-info
obol kubectl get namespaces
```

### Forcing image rebuilds

`obol stack up` (with `OBOL_DEVELOPMENT=true`) reuses any locally-tagged
`ghcr.io/obolnetwork/<name>:latest` image to keep warm runs fast. Use
`OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES` to override that for specific images:

```bash
# Rebuild everything
OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES=true obol stack up

# Rebuild only the image you changed — avoids the full 10-minute rebuild
OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES=x402-verifier obol stack up
OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES=serviceoffer-controller,x402-buyer obol stack up
```

| Value | Effect |
|-------|--------|
| unset / `false` / `0` | Reuse all cached images (default) |
| `true` / `all` | Rebuild every local image |
| `img1,img2` | Rebuild only the named images |

The short name is the image base without registry prefix or tag
(`x402-verifier` from `ghcr.io/obolnetwork/x402-verifier:latest`).

Available image names: `x402-verifier`, `serviceoffer-controller`,
`x402-buyer`, `demo-server`, `obol-stack-public-storefront`
(`public-storefront` alias accepted).

When nothing was rebuilt, the "Local dev images ready" summary line prints
the selective-rebuild hint so you know the option is available.
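
The value semantics in the table can be sketched as follows (a hypothetical
`forceRebuild` helper illustrating the documented behaviour; the real CLI
implementation may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// forceRebuild reports whether image `name` should be force-rebuilt for
// a given OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES value. Illustrative only.
func forceRebuild(envVal, name string) bool {
	switch strings.ToLower(strings.TrimSpace(envVal)) {
	case "", "false", "0":
		return false // reuse all cached images (default)
	case "true", "all":
		return true // rebuild every local image
	}
	// Otherwise treat the value as a comma-separated list of short names.
	for _, n := range strings.Split(envVal, ",") {
		if strings.TrimSpace(n) == name {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(forceRebuild("x402-verifier,demo-server", "demo-server")) // true
	fmt.Println(forceRebuild("false", "demo-server"))                     // false
}
```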

The cluster runs:
- k3d (Kubernetes in Docker) with 1 server + 3 agent nodes
- Traefik (ingress controller)
31 changes: 31 additions & 0 deletions .github/workflows/renovate.yml
@@ -0,0 +1,31 @@
name: Renovate

on:
  schedule:
    - cron: "0 * * * *" # every hour
  workflow_dispatch:
    inputs:
      dry_run:
        description: "Dry-run (no PRs opened)"
        required: false
        default: "false"

concurrency:
  group: renovate
  cancel-in-progress: false

jobs:
  renovate:
    name: Renovate
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

      - name: Run Renovate
        uses: renovatebot/github-action@a4df0f50ee02c2fc7b4b8f8aa4a3ff6929fa4fc1 # v46.1.13
        env:
          LOG_LEVEL: debug
          RENOVATE_DRY_RUN: ${{ github.event.inputs.dry_run == 'true' && 'full' || '' }}
        with:
          token: ${{ secrets.RENOVATE_TOKEN }}
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -282,7 +282,7 @@ Key code: `cmd/x402-buyer/`, `internal/x402/buyer/`, and `internal/x402/forwarda
1. **Absolute paths required** — Docker volume mounts need absolute paths (resolved at `obol stack init`)
2. **Two-stage templating** — Stage 1 (CLI flags) → Stage 2 (Helmfile) separation is critical
3. **Unique namespaces** — each deployment must have unique namespace
4. **`OBOL_DEVELOPMENT=true`** — required for `obol stack up` to auto-build local images (x402-verifier, serviceoffer-controller, x402-buyer, demo-server, public-storefront). The build path reuses any locally-tagged image of the same name to keep warm runs fast; pass `OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES=true` alongside it to force `docker build` for every image regardless of what's already in the local daemon. The "Local dev images ready" summary line surfaces this hint when nothing was rebuilt this run.
4. **`OBOL_DEVELOPMENT=true`** — required for `obol stack up` to auto-build local images (x402-verifier, serviceoffer-controller, x402-buyer, demo-server, obol-stack-public-storefront; `public-storefront` alias accepted). The build path reuses any locally-tagged image of the same name to keep warm runs fast; set `OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES` to control force-rebuilds: `true`/`all` rebuilds every image, a comma-separated list (e.g. `x402-verifier,serviceoffer-controller`) rebuilds only those images, and `false`/`0`/unset skips forced rebuilds. The "Local dev images ready" summary line surfaces this hint when nothing was rebuilt this run.
5. **Root-owned PVCs** — `-f` flag required to remove in `obol stack purge`
6. **Narrow review boundaries** — for controller/RBAC/payment changes, spell out exact security and user-journey invariants before editing or delegating; broad review prompts have previously produced noisy findings and missed test drift

38 changes: 37 additions & 1 deletion cmd/obol/sell_agent.go
@@ -11,6 +11,7 @@ import (
"github.com/ObolNetwork/obol-stack/internal/config"
"github.com/ObolNetwork/obol-stack/internal/hermes"
"github.com/ObolNetwork/obol-stack/internal/kubectl"
"github.com/ObolNetwork/obol-stack/internal/model"
"github.com/ObolNetwork/obol-stack/internal/schemas"
"github.com/ObolNetwork/obol-stack/internal/tunnel"
"github.com/ObolNetwork/obol-stack/internal/ui"
@@ -310,7 +311,18 @@ func runAgentBackedDemo(
if _, err := kubectlApplyOutput(cfg, nsManifest); err != nil {
return fmt.Errorf("apply namespace: %w", err)
}
// Pin a model up front so the controller doesn't park at
// ModelUnpinned. The demo is meant to be one-shot; surfacing
// LiteLLM-empty as a clear error here is better than letting the
// agent silently never reach Ready.
demoModel, modelErr := resolveDefaultAgentModel(cfg)
if modelErr != nil {
return fmt.Errorf("resolve a default model for the demo agent: %w", modelErr)
}
u.Infof("Pinning demo agent to model %q (cluster top-of-rank)", demoModel)

manifest := agentcrd.BuildAgent(agentName, agentcrd.AgentOptions{
Model: demoModel,
Skills: spec.Agent.Skills,
Objective: spec.Agent.Objective,
CreateWallet: true,
@@ -398,12 +410,36 @@
}

u.Blank()
u.Dim("Note: agent-backed demos depend on serviceoffer-controller step 2d for in-cluster Hermes provisioning. Until that lands, the offer will park at Ready=False until you provision the agent's pod manually.")
printDemoTryIt(u, name, typeName, price, symbol, chain, tunnelURL, false)

return nil
}

// resolveDefaultAgentModel picks a model to pin onto a fresh Agent CR.
// Walks the cluster's LiteLLM model_list (the same source `obol model
// list` reads), drops the meta `paid/*` route, and returns the top
// entry. The list is already in the operator's preferred order via
// `obol model prefer`, so "first non-paid" is a meaningful default.
//
// Returns an error if the cluster has no usable models — the caller
// turns this into a clear "configure a model first" message rather than
// silently picking nothing.
func resolveDefaultAgentModel(cfg *config.Config) (string, error) {
configured, err := model.GetConfiguredModels(cfg)
if err != nil {
return "", err
}
for _, name := range configured {
// Skip the paid/* meta route — it's a buyer-side namespace, not
// a model the agent can actually run inference on.
if strings.HasPrefix(name, "paid/") {
continue
}
return name, nil
}
return "", fmt.Errorf("no usable LiteLLM model configured; run `obol model setup` or pull an Ollama model first")
}

// agentRefForSale is what we need to know about the referenced Agent CR
// to assemble a coherent ServiceOffer: namespace (where the offer goes
// alongside the agent), wallet address (default revenue recipient),
8 changes: 6 additions & 2 deletions internal/embed/infrastructure/base/templates/x402.yaml
@@ -288,10 +288,14 @@ spec:
resources:
requests:
cpu: 25m
memory: 64Mi
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
# Bumped from 256Mi after adding the Agent informer + agent
# reconciler — the additional shared cache + workqueue + the
# in-controller keystore generation pushed steady-state past
# 256Mi and triggered OOMKilled restart loops.
memory: 512Mi

---
apiVersion: v1
2 changes: 1 addition & 1 deletion internal/hermes/hermes.go
@@ -33,7 +33,7 @@ const (
// renovate: datasource=helm depName=raw registryUrl=https://bedag.github.io/helm-charts/
rawChartVersion = "2.0.2"

defaultImage = "nousresearch/hermes-agent:v2026.4.30"
defaultImage = "nousresearch/hermes-agent:v2026.5.7"
// Use the upstream image venv instead of cloning Hermes into the PVC on
// every cold start. The init container below validates the required extras
// are present so image regressions fail before the gateway starts.
25 changes: 25 additions & 0 deletions internal/serviceoffercontroller/agent.go
@@ -384,6 +384,16 @@ func (c *Controller) applyAgentObject(ctx context.Context, resource dynamic.Reso
return err
}

// Some kinds we only ever create — never reshape after the fact.
// Namespaces are the canonical example: the host CLI creates the
// namespace before applying the Agent CR (since the CR is namespaced
// and can't land otherwise), and the controller's RBAC intentionally
// only grants `create`, not `update`, to keep the blast radius small.
// Treat existence as success for these and move on.
if isCreateOnlyKind(desired.GetKind()) {
return nil
}

// Preserve resourceVersion + uid so Update doesn't 409. Spec/data is
// rewritten in full from `desired`; that's the controller's contract.
desired.SetResourceVersion(existing.GetResourceVersion())
@@ -397,6 +407,21 @@
return err
}

// isCreateOnlyKind returns true for kinds that the controller refuses to
// Update on subsequent reconciles. Either the Update would require
// broader RBAC than we want (Namespace), or the resource has immutable
// spec fields that reject any wholesale Update (PVC's
// `spec is immutable after creation`). Mutable kinds (ConfigMap, Secret
// data, Deployment, Service ports, ServiceAccount) keep going through
// the normal Update path so reconciles still pick up rendered changes.
func isCreateOnlyKind(kind string) bool {
switch kind {
case "Namespace", "PersistentVolumeClaim":
return true
}
return false
}

// resourceFor maps an unstructured.Unstructured to the dynamic resource
// client for its GVK. Centralising the mapping keeps provisionAgent's
// loop generic and means a future kind addition is a one-line case.
2 changes: 1 addition & 1 deletion internal/serviceoffercontroller/agent_render.go
@@ -23,7 +23,7 @@ const (
hermesAPISecret = "hermes-api-server"
hermesDataPVC = "hermes-data"
hermesAPIPath = "/health"
defaultHermesImage = "nousresearch/hermes-agent:v2026.4.30"
defaultHermesImage = "nousresearch/hermes-agent:v2026.5.7"
)

// agentLabels returns the standard label set we attach to every primitive