2 changes: 1 addition & 1 deletion .agents/skills/obol-stack-dev/SKILL.md
@@ -47,7 +47,7 @@ OBOL_TOKEN_BASE_SEPOLIA=0x0a09371a8b011d5110656ceBCc70603e53FD2c78

**Payment assertion**: don't bypass the agent buy step with a direct script exec. If the agent times out, diagnose Hermes/LiteLLM/model routing — don't relax the assertion. Required evidence: `PurchaseRequest Ready=True` + paid HTTP 200 + on-chain `Transfer` + exact balance deltas.

**QA LLM**: full seller/buyer QA must route Alice and Bob through `OBOL_LLM_ENDPOINT` (OpenAI-compatible vLLM or llama.cpp on the QA host). Default `OBOL_LLM_MODEL=qwen36-fast`. Sequence: `obol model setup custom` → `obol model prefer` → one `obol model sync`. Local Ollama and cloud-fallback are **not** acceptable green substitutes for full-flow QA.
**QA LLM**: full seller/buyer QA must route Alice and Bob through `OBOL_LLM_ENDPOINT` (OpenAI-compatible vLLM or llama.cpp on the QA host). Default `OBOL_LLM_MODEL=qwen36-deep` (27B-class). The smaller `qwen36-fast` (~4B) was the previous default but flakes on the long single-shot agent-buy prompt at flow-13/14 step 46 — see the retry-wrapper rationale in `flows/lib-dual-stack.sh::agent_buy_with_retry`. Sequence: `obol model setup custom` → `obol model prefer` → one `obol model sync`. Local Ollama and cloud-fallback are **not** acceptable green substitutes for full-flow QA.

**Public vs private routes**: `/services/*`, `/.well-known/agent-registration.json`, `/skill.md`, and `/` (storefront) are public via the tunnel. **NEVER** remove `hostnames: ["obol.stack"]` from frontend or eRPC HTTPRoutes — exposing them publicly is a critical security flaw.

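A minimal sketch of gathering the payment-assertion evidence listed above, assuming Foundry's `cast` on PATH; `RPC_URL`, `TOKEN`, `ALICE`, and `BOB_SIGNER` are hypothetical stand-ins for values the flow scripts already hold, and the flow's archived receipts remain the authoritative record:

```bash
# Balance snapshots for the exact-delta check (run before and after the buy).
cast call "$TOKEN" 'balanceOf(address)(uint256)' "$ALICE"      --rpc-url "$RPC_URL"
cast call "$TOKEN" 'balanceOf(address)(uint256)' "$BOB_SIGNER" --rpc-url "$RPC_URL"

# On-chain Transfer(Bob signer -> Alice) events over the last ~30 blocks (~1 min).
from=$(( $(cast block-number --rpc-url "$RPC_URL") - 30 ))
cast logs --rpc-url "$RPC_URL" --address "$TOKEN" \
  --from-block "$from" 'Transfer(address,address,uint256)'
```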
4 changes: 2 additions & 2 deletions .agents/skills/obol-stack-dev/references/llm-routing.md
@@ -47,15 +47,15 @@ Canonical user flow for vLLM / sglang / mlx-lm / a remote GPU box. **No ConfigMa
obol stack up

# Drop auto-detected Ollama entries — they will out-rank the new custom entry.
# internal/model/rank.go parses ":9b" as 90 deci-billions; "qwen36-fast" (no
# internal/model/rank.go parses ":9b" as 90 deci-billions; "qwen36-deep" (no
# ":Nb" tag) ranks 0. Without removing them, the agent stays on slow host Ollama.
obol model remove qwen3.5:9b
obol model remove qwen3.5:4b

obol model setup custom \
--name spark1-vllm \
--endpoint http://192.168.18.23:8000/v1 \
--model qwen36-fast
--model qwen36-deep
# `setup custom` validates the endpoint, patches LiteLLM, and internally calls
# syncAgentModels → hermes.Sync → rewrites the default agent's deployment files
# with the new primary model. No manual restart needed.
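Before `setup custom`, it's worth confirming the endpoint really speaks the OpenAI `/v1` surface and that the id you pass as `--model` exists there; a quick probe, assuming `jq` is installed (vLLM and llama.cpp's server both expose this listing):

```bash
curl -s http://192.168.18.23:8000/v1/models | jq -r '.data[].id'
# The --model value (qwen36-deep) must appear verbatim in this list.
```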
2 changes: 1 addition & 1 deletion .agents/skills/obol-stack-dev/references/paid-flows.md
@@ -64,7 +64,7 @@ The runner has a `warn_unpaid_base_sepolia_rpc` preflight. The CLI scrubs paid-R
- Alice ServiceOffer reaches `Ready=True`.
- ERC-8004 registration tx published to Base Sepolia (`/.well-known/agent-registration.json` reachable via tunnel for live flows).
- Bob `PurchaseRequest` reaches `Ready=True`.
- LiteLLM exposes `paid/<OBOL_LLM_MODEL>` (default `qwen36-fast`).
- LiteLLM exposes `paid/<OBOL_LLM_MODEL>` (default `qwen36-deep`; probe sketch below).
- Paid inference returns HTTP 200 and **final-answer** content (not reasoning metadata or tool-catalogue text).
- On-chain `Transfer(Bob signer → Alice, <PAID_AMOUNT>)` receipt is archived.
- Alice balance increases and Bob signer balance decreases by exactly `PAID_AMOUNT` wei (USDC for flow-11, OBOL for flow-13/14).
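A probe sketch for the LiteLLM checklist item; `LITELLM_URL` and `LITELLM_KEY` are hypothetical stand-ins for however the flow reaches and authenticates LiteLLM (the flow scripts own the real wiring):

```bash
curl -s -H "Authorization: Bearer $LITELLM_KEY" "$LITELLM_URL/v1/models" \
  | jq -r '.data[].id' | grep -qx "paid/${OBOL_LLM_MODEL:-qwen36-deep}" \
  && echo "paid route exposed"
```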
2 changes: 1 addition & 1 deletion .agents/skills/obol-stack-dev/references/remote-qa.md
@@ -60,7 +60,7 @@ Set `OBOL_LLM_MODEL` to an id returned by `/models`.
cd "$QA"
export PATH="$QA/.workspace/bin:$FOUNDRY_BIN:$TOOL_ROOT:$PATH"
export OBOL_LLM_ENDPOINT=${OBOL_LLM_ENDPOINT:-http://127.0.0.1:8000/v1}
export OBOL_LLM_MODEL=${OBOL_LLM_MODEL:-qwen36-fast}
export OBOL_LLM_MODEL=${OBOL_LLM_MODEL:-qwen36-deep}
ts=$(date +%Y%m%d-%H%M%S)
log="$QA/.tmp/flow-14-$ts.log"
art="$QA/.tmp/flow-14-$ts-artifacts"
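Since `OBOL_LLM_ENDPOINT` already ends in `/v1`, the `/models` listing referenced above is one curl away (assuming `jq` on the QA host):

```bash
curl -s "$OBOL_LLM_ENDPOINT/models" | jq -r '.data[].id'
# Export one of these ids as OBOL_LLM_MODEL before launching the flow.
```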
6 changes: 6 additions & 0 deletions .gitleaks.toml
@@ -44,6 +44,12 @@ regexes = [
'''test test test test test test test test test test test junk''',
# USDC storage slot values (uint256 padded, not secrets)
'''0x0{50,}[0-9a-fA-F]{1,14}''',
# Shell variable expansion in HTTP Auth headers — the actual secret
# comes from $BOB_TOKEN / $LITELLM_KEY / etc. at runtime, not from
# the literal source text. Matches `Authorization: Bearer $VAR` and
# `Authorization: Basic ${VAR}` forms only; a hardcoded literal still
# trips the rule because the allowlist regex requires a literal `$`.
'''Authorization:\s+(?:Basic|Bearer)\s+\$\{?[A-Za-z_][A-Za-z0-9_]*''',
]
paths = [
# Gitleaks own config
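A quick behavior check for the new allowlist entry, assuming GNU `grep -P` for the PCRE non-capturing group (the pattern is copied from the TOML above; sample lines are hypothetical):

```bash
pat='Authorization:\s+(?:Basic|Bearer)\s+\$\{?[A-Za-z_][A-Za-z0-9_]*'
echo 'Authorization: Bearer $BOB_TOKEN'  | grep -qP "$pat" && echo "allowlisted"
echo 'Authorization: Bearer sk-live-abc' | grep -qP "$pat" || echo "still flagged"
```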
6 changes: 3 additions & 3 deletions CLAUDE.md
@@ -37,7 +37,7 @@ go test -tags integration -v -run TestIntegration_Tunnel_SellDiscoverBuySidecar_

# Release-gate seller/buyer smoke (requires OBOL_LLM_ENDPOINT pointing at OpenAI-compatible vLLM/llama.cpp)
RELEASE_SMOKE_INCLUDE_OBOL=true RELEASE_SMOKE_INCLUDE_OBOL_FORK=true \
OBOL_LLM_ENDPOINT=http://127.0.0.1:8000/v1 OBOL_LLM_MODEL=qwen36-fast \
OBOL_LLM_ENDPOINT=http://127.0.0.1:8000/v1 OBOL_LLM_MODEL=qwen36-deep \
bash flows/release-smoke.sh

just up # obol stack init + up
@@ -246,13 +246,13 @@ obol model remove qwen3.5:4b
obol model setup custom \
--name spark1-vllm \
--endpoint http://192.168.18.23:8000/v1 \
--model qwen36-fast
--model qwen36-deep
# `setup custom` validates the endpoint, patches LiteLLM, and internally calls
# syncAgentModels → hermes.Sync → rewrites the default agent's deployment files
# with the new primary model. No manual restart needed.

# (b) OR keep Ollama and force-promote the custom entry to the head:
obol model prefer qwen36-fast
obol model prefer qwen36-deep
obol model sync # propagate to Hermes

obol model list # confirm head of model_list
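Because ranking decides which entry the agent actually uses, a one-line pre-flight before the release smoke is cheap insurance ("confirm head of model_list" above implies the preferred entry prints first; that output shape is an assumption):

```bash
obol model list | head -n1 | grep -q qwen36-deep \
  || echo "WARN: custom entry is not at the head of model_list"
```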
4 changes: 2 additions & 2 deletions flows/buy-external.sh
@@ -60,7 +60,7 @@
# EXTERNAL_PR_TIMEOUT_S default: 300 (5 min)
# EXTERNAL_LOG_BLOCKS_BACK default: 30 (~1 min on Base Sepolia at 2s/blk)
# OBOL_LLM_ENDPOINT default: http://127.0.0.1:8000/v1
# OBOL_LLM_MODEL default: qwen36-fast
# OBOL_LLM_MODEL default: qwen36-deep (27B-class)
# OBOL_LLM_NAME default: external-llm
#
# Exit code: 0 on PASS (every step pass), 1 on any FAIL.
@@ -106,7 +106,7 @@ EXTERNAL_PR_TIMEOUT_S="${EXTERNAL_PR_TIMEOUT_S:-300}"
EXTERNAL_LOG_BLOCKS_BACK="${EXTERNAL_LOG_BLOCKS_BACK:-30}"

OBOL_LLM_ENDPOINT="${OBOL_LLM_ENDPOINT:-http://127.0.0.1:8000/v1}"
OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-fast}"
OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-deep}"
OBOL_LLM_NAME="${OBOL_LLM_NAME:-external-llm}"

# Resolve OBOL_ROOT before sourcing helpers — lib.sh re-derives it but
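If you need a different lookback window, the conversion behind that default is simple at Base Sepolia's ~2 s block time (`LOOKBACK_S` is a hypothetical input, not a flow variable):

```bash
LOOKBACK_S=60
EXTERNAL_LOG_BLOCKS_BACK=$(( LOOKBACK_S / 2 ))   # 30 blocks, roughly 1 min
```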
2 changes: 1 addition & 1 deletion flows/flow-03-inference.sh
@@ -5,7 +5,7 @@ source "$(dirname "$0")/lib.sh"

if [ -n "${OBOL_LLM_ENDPOINT:-}" ]; then
run_step "Route LiteLLM through QA LLM endpoint" route_llm_via_obol_cli "$OBOL"
LITELLM_MODEL="${OBOL_LLM_MODEL:-qwen36-fast}"
LITELLM_MODEL="${OBOL_LLM_MODEL:-qwen36-deep}"
else
LITELLM_MODEL="$FLOW_MODEL"

4 changes: 2 additions & 2 deletions flows/flow-04-agent.sh
@@ -110,8 +110,8 @@ fi

model_name=$("$OBOL" kubectl get cm hermes-config -n "$NS" -o jsonpath='{.data.config\.yaml}' 2>/dev/null | sed -n 's/^[[:space:]]*default: //p' | tr -d '"' | head -1)
[ -n "$model_name" ] || model_name="qwen3.5:35b"
if [ -n "${OBOL_LLM_ENDPOINT:-}" ] && [ "$model_name" != "${OBOL_LLM_MODEL:-qwen36-fast}" ]; then
fail "Hermes default model $model_name does not match QA LLM model ${OBOL_LLM_MODEL:-qwen36-fast}"
if [ -n "${OBOL_LLM_ENDPOINT:-}" ] && [ "$model_name" != "${OBOL_LLM_MODEL:-qwen36-deep}" ]; then
fail "Hermes default model $model_name does not match QA LLM model ${OBOL_LLM_MODEL:-qwen36-deep}"
cleanup_pid "$PF_PID"
emit_metrics
exit 0
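The jsonpath/sed pipeline above assumes a `default:` key somewhere in hermes-config's `config.yaml`; a standalone check of the extraction against that assumed shape (the real ConfigMap may nest the key differently):

```bash
printf 'models:\n  default: "qwen36-deep"\n' \
  | sed -n 's/^[[:space:]]*default: //p' | tr -d '"' | head -1   # -> qwen36-deep
```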
4 changes: 2 additions & 2 deletions flows/flow-11-dual-stack.sh
@@ -36,7 +36,7 @@
# FLOW11_BOB_HTTP_PORT FLOW11_BOB_HTTP_ALT_PORT
# FLOW11_BOB_HTTPS_PORT FLOW11_BOB_HTTPS_ALT_PORT
# OBOL_LLM_ENDPOINT required vLLM/llama.cpp/OpenAI-compatible endpoint
# OBOL_LLM_MODEL endpoint model name (default: qwen36-fast)
# OBOL_LLM_MODEL endpoint model name (default: qwen36-deep)
source "$(dirname "$0")/lib.sh"

# ═════════════════════════════════════════════════════════════════
@@ -60,7 +60,7 @@ BOB_HTTP_ALT_PORT="${FLOW11_BOB_HTTP_ALT_PORT:-$(pick_free_port)}"
BOB_HTTPS_PORT="${FLOW11_BOB_HTTPS_PORT:-$(pick_free_port)}"
BOB_HTTPS_ALT_PORT="${FLOW11_BOB_HTTPS_ALT_PORT:-$(pick_free_port)}"
FACILITATOR_URL="${FLOW11_FACILITATOR_URL:-https://x402.gcp.obol.tech}"
OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-fast}"
OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-deep}"
export OBOL_LLM_MODEL
FLOW11_ARTIFACT_DIR="${FLOW11_ARTIFACT_DIR:-$OBOL_ROOT/.tmp/flow-11-$(date +%Y%m%d-%H%M%S)}"
if ! BASE_SEPOLIA_RPC="$(resolve_base_sepolia_rpc "${FLOW11_BASE_SEPOLIA_RPC:-${BASE_SEPOLIA_RPC:-}}")"; then
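`pick_free_port` is owned by `lib.sh`; a stand-in with the same contract (print one unused TCP port), useful when tracing these `FLOW11_*` defaults outside the repo:

```bash
pick_free_port() {
  # Bind to port 0 and let the kernel hand back a free port.
  python3 -c 'import socket; s = socket.socket(); s.bind(("127.0.0.1", 0)); print(s.getsockname()[1])'
}
```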
36 changes: 6 additions & 30 deletions flows/flow-13-dual-stack-obol.sh
@@ -36,7 +36,7 @@
# FLOW13_BOB_HTTP_PORT, _ALT, _HTTPS_PORT, _HTTPS_ALT_PORT
# FLOW13_ARTIFACT_DIR where receipts + logs land
# OBOL_LLM_ENDPOINT required vLLM/llama.cpp/OpenAI-compatible endpoint
# OBOL_LLM_MODEL endpoint model name (default: qwen36-fast)
# OBOL_LLM_MODEL endpoint model name (default: qwen36-deep, 27B-class)
#
source "$(dirname "$0")/lib.sh"
DUAL_STACK_FLOW_PREFIX="FLOW13"
@@ -61,7 +61,7 @@ BOB_HTTP_ALT_PORT="$(dual_stack_env_or_free_port BOB_HTTP_ALT_PORT)"
BOB_HTTPS_PORT="$(dual_stack_env_or_free_port BOB_HTTPS_PORT)"
BOB_HTTPS_ALT_PORT="$(dual_stack_env_or_free_port BOB_HTTPS_ALT_PORT)"

OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-fast}"
OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-deep}"
export OBOL_LLM_MODEL

ANVIL_PORT="${FLOW13_ANVIL_PORT:-$(pick_free_port)}"
@@ -899,31 +899,7 @@ pass "Agent discovery prompt issued (success will be confirmed by buy + Purchase
# ═════════════════════════════════════════════════════════════════

step "Bob's agent: buy 5 OBOL Permit2 auths from Alice"
buy_response=$(curl -sf --max-time 300 \
-X POST "http://localhost:${BOB_AGENT_PORT}/v1/chat/completions" \
-H "Authorization: Bearer $BOB_TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"model\": \"$BOB_AGENT_RUNTIME-agent\",
\"messages\": [{
\"role\": \"user\",
\"content\": \"Use the buy-x402 skill and your terminal tool. Run exactly once: ERPC_URL=http://erpc.erpc.svc.cluster.local/rpc ERPC_NETWORK=base-sepolia python3 $BOB_OBOL_SKILLS_DIR/buy-x402/scripts/buy.py buy alice-obol --endpoint $TUNNEL_URL/services/alice-obol-inference/v1/chat/completions --model $OBOL_LLM_MODEL --count 5\"
}],
\"max_tokens\": 4000,
\"stream\": false
}" 2>&1 || true)
buy_content=$(extract_assistant_content "$buy_response" 2>/dev/null || true)
echo "${buy_content:0:500}"
# Don't grep buy_content for natural-language confirmation; structural success
# is the PurchaseRequest CR Ready=True poll below.
if [ -z "$(printf '%s' "$buy_content" | tr -d '[:space:]')" ]; then
echo " ! Agent returned no final assistant text; confirming purchase via PurchaseRequest CR"
fi
if printf '%s' "$buy_content" | agent_response_refused; then
fail "Agent refused to run buy.py: ${buy_content:0:500}"
emit_metrics; exit 1
fi
pass "Agent buy prompt issued (success will be confirmed by PurchaseRequest CR)"
agent_buy_with_retry

# ═════════════════════════════════════════════════════════════════
# 36-39. PR Ready / LiteLLM rollout / sidecar auths / paid call
Expand All @@ -947,9 +923,9 @@ buyer_status=$(buyer_sidecar_status)
# Mirror flow-14's relaxed assertion. Two reasons to allow remaining>=5
# rather than exact-5: (a) controller may merge into an existing auth
# pool on rerun (remaining=10 etc.); (b) the agent prompt asks for
# --count 5, but qwen36-fast occasionally hallucinates --count 1, which
# is an LLM-stochasticity issue not a buy-flow correctness issue. We
# only care that the buy step actually provisioned at least the
# --count 5, but the LLM occasionally hallucinates a different count,
# which is an LLM-stochasticity issue not a buy-flow correctness issue.
# We only care that the buy step actually provisioned at least the
# requested count.
remaining_n=$(echo "$buyer_status" | grep -oE 'remaining=[0-9]+' | head -1 | cut -d= -f2)
if [ -n "$remaining_n" ] && [ "$remaining_n" -ge 5 ] 2>/dev/null; then
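The `remaining=` capture above is easy to sanity-check in isolation (the sample sidecar line is hypothetical; the `2>/dev/null` on the `-ge` comparison is what guards a non-numeric capture):

```bash
echo 'auths: remaining=10 spent=0' \
  | grep -oE 'remaining=[0-9]+' | head -1 | cut -d= -f2   # -> 10
```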
28 changes: 3 additions & 25 deletions flows/flow-14-live-obol-base-sepolia.sh
@@ -43,7 +43,7 @@
# FLOW14_ARTIFACT_DIR where receipts + logs land
# FLOW14_BOB_GAS_MIN_WEI default: 100000000000000
# OBOL_LLM_ENDPOINT required vLLM/llama.cpp/OpenAI-compatible endpoint
# OBOL_LLM_MODEL endpoint model name (default: qwen36-fast)
# OBOL_LLM_MODEL endpoint model name (default: qwen36-deep, 27B-class)
#
# Usage:
# ./flows/flow-14-live-obol-base-sepolia.sh
@@ -74,7 +74,7 @@ BOB_HTTP_ALT_PORT="$(dual_stack_env_or_free_port BOB_HTTP_ALT_PORT)"
BOB_HTTPS_PORT="$(dual_stack_env_or_free_port BOB_HTTPS_PORT)"
BOB_HTTPS_ALT_PORT="$(dual_stack_env_or_free_port BOB_HTTPS_ALT_PORT)"

OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-fast}"
OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-deep}"
export OBOL_LLM_MODEL

# Live Base Sepolia RPC + public Obol facilitator. No host.k3d.internal pin.
@@ -953,29 +953,7 @@ pass "Agent discovery prompt issued (success will be confirmed by buy + Purchase
# ═════════════════════════════════════════════════════════════════

step "Bob's agent: buy 5 OBOL Permit2 auths from Alice"
buy_response=$(curl -sf --max-time 300 \
-X POST "http://localhost:${BOB_AGENT_PORT}/v1/chat/completions" \
-H "Authorization: Bearer $BOB_TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"model\": \"$BOB_AGENT_RUNTIME-agent\",
\"messages\": [{
\"role\": \"user\",
\"content\": \"Use the buy-x402 skill and your terminal tool. Run exactly once: ERPC_URL=http://erpc.erpc.svc.cluster.local/rpc ERPC_NETWORK=base-sepolia python3 $BOB_OBOL_SKILLS_DIR/buy-x402/scripts/buy.py buy alice-obol --endpoint $TUNNEL_URL/services/alice-obol-inference/v1/chat/completions --model $OBOL_LLM_MODEL --count 5\"
}],
\"max_tokens\": 4000,
\"stream\": false
}" 2>&1 || true)
buy_content=$(extract_assistant_content "$buy_response" 2>/dev/null || true)
echo "${buy_content:0:500}"
if [ -z "$(printf '%s' "$buy_content" | tr -d '[:space:]')" ]; then
echo " ! Agent returned no final assistant text; confirming purchase via PurchaseRequest CR"
fi
if printf '%s' "$buy_content" | agent_response_refused; then
fail "Agent refused to run buy.py: ${buy_content:0:500}"
emit_metrics; exit 1
fi
pass "Agent buy prompt issued (success will be confirmed by PurchaseRequest CR)"
agent_buy_with_retry

# ═════════════════════════════════════════════════════════════════
# 31-34. PR Ready / LiteLLM rollout / sidecar auths / paid call
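A gas-floor preflight sketch against the documented `FLOW14_BOB_GAS_MIN_WEI` default, assuming `cast` and a hypothetical `BOB_SIGNER_ADDR` (bash integer compare holds as long as the signer stays under the 64-bit ceiling of ~9.2 ETH in wei, safe for a testnet signer):

```bash
min="${FLOW14_BOB_GAS_MIN_WEI:-100000000000000}"   # 0.0001 ETH
bal=$(cast balance "$BOB_SIGNER_ADDR" --rpc-url "$BASE_SEPOLIA_RPC")
[ "$bal" -ge "$min" ] || echo "WARN: Bob signer below gas floor"
```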
82 changes: 82 additions & 0 deletions flows/lib-dual-stack.sh
@@ -347,6 +347,88 @@ except Exception as e:
" 2>&1 || true
}

# Send the long single-shot buy prompt to Bob's agent. The prompt expands
# against the caller's environment (BOB_AGENT_PORT, BOB_TOKEN,
# BOB_AGENT_RUNTIME, BOB_OBOL_SKILLS_DIR, TUNNEL_URL, OBOL_LLM_MODEL).
_agent_buy_send_prompt() {
curl -sf --max-time 300 \
-X POST "http://localhost:${BOB_AGENT_PORT}/v1/chat/completions" \
-H "Authorization: Bearer $BOB_TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"model\": \"$BOB_AGENT_RUNTIME-agent\",
\"messages\": [{
\"role\": \"user\",
\"content\": \"Use the buy-x402 skill and your terminal tool. Run exactly once: ERPC_URL=http://erpc.erpc.svc.cluster.local/rpc ERPC_NETWORK=base-sepolia python3 $BOB_OBOL_SKILLS_DIR/buy-x402/scripts/buy.py buy alice-obol --endpoint $TUNNEL_URL/services/alice-obol-inference/v1/chat/completions --model $OBOL_LLM_MODEL --count 5\"
}],
\"max_tokens\": 4000,
\"stream\": false
}" 2>&1 || true
}

_agent_buy_pr_exists() {
bob kubectl get purchaserequests.obol.org -n "$BOB_AGENT_NS" alice-obol \
-o name 2>/dev/null | grep -q .
}

# 1-retry wrapper for the agent buy prompt at flow-13/14 step 46. The QA LLM
# (qwen36-deep, 27B-class — see OBOL_LLM_MODEL default) occasionally narrates a
# fabricated failure on the long single-shot buy prompt instead of actually
# invoking the bash tool. When that happens, no PurchaseRequest is created and
# step 47 fails with "PurchaseRequest CR not ready" — even though buy.py was
# never invoked. The smaller qwen36-fast (~4B) flakes much more often; deep is
# the new default for that reason. See plans/inference-v1337-followup-20260514.md.
#
# Strategy: poll for the PR for up to 60s after the first prompt; if absent,
# print a LOUD warning flagging this as agent unreliability and re-send the
# prompt once. If still absent after the retry, step 47 fails as before.
agent_buy_with_retry() {
local response content retried=0 i

response=$(_agent_buy_send_prompt)
content=$(extract_assistant_content "$response" 2>/dev/null || true)
echo "${content:0:500}"
if [ -z "$(printf '%s' "$content" | tr -d '[:space:]')" ]; then
echo " ! Agent returned no final assistant text; confirming purchase via PurchaseRequest CR"
fi
if printf '%s' "$content" | agent_response_refused; then
fail "Agent refused to run buy.py: ${content:0:500}"
emit_metrics; exit 1
fi

# Wait up to 60s for the controller to reconcile the PR. Healthy runs see
# it within ~5s; the long ceiling absorbs cluster-cold-start jitter.
for i in $(seq 1 12); do
_agent_buy_pr_exists && break
sleep 5
done

if ! _agent_buy_pr_exists; then
echo ""
echo " ╔════════════════════════════════════════════════════════════════════════╗"
echo " ║ WARN: agent did NOT create a PurchaseRequest after 60s. ║"
echo " ║ Documented LLM flake on the long single-shot buy prompt — agent ║"
echo " ║ narrated a fabricated failure instead of invoking buy.py. ║"
echo " ║ Re-prompting ONCE. ║"
echo " ║ If this fires regularly: confirm OBOL_LLM_MODEL=qwen36-deep (default) ║"
echo " ║ not qwen36-fast (4B), or escalate to qwen36-35b-heretic, or add a ║"
echo " ║ non-agent fallback path. ║"
echo " ║ Ref: plans/inference-v1337-followup-20260514.md ║"
echo " ╚════════════════════════════════════════════════════════════════════════╝"
echo ""
retried=1
response=$(_agent_buy_send_prompt)
content=$(extract_assistant_content "$response" 2>/dev/null || true)
echo " RETRY response: ${content:0:500}"
if printf '%s' "$content" | agent_response_refused; then
fail "Agent refused to run buy.py on retry: ${content:0:500}"
emit_metrics; exit 1
fi
fi

pass "Agent buy prompt issued (retry=$retried; success will be confirmed by PurchaseRequest CR)"
}

extract_assistant_content() {
DUAL_STACK_RESPONSE="$1" python3 - <<'PY'
import json
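For orientation, the step-47 confirmation `agent_buy_with_retry` defers to looks roughly like this (CR name and namespace taken from `_agent_buy_pr_exists` above; the flow scripts own the real poll loop and its timeout):

```bash
for i in $(seq 1 60); do
  ready=$(bob kubectl get purchaserequests.obol.org -n "$BOB_AGENT_NS" alice-obol \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null || true)
  [ "$ready" = "True" ] && break
  sleep 5
done
[ "$ready" = "True" ] || { fail "PurchaseRequest CR not ready"; emit_metrics; exit 1; }
```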