[deployment] Cost guardrail threshold-cross alert hook (webhook / callback) #70

## Background

Cost guardrail warnings fire at the configured thresholds (defaults: 50% and 80% of the daily or monthly cap). Each warning is logged as a JSONL record in `<agent>/log/YYYY-MM/YYYY-MM-DD.jsonl` with severity INFO or WARN (verified at `atomic_agents/agent.py:1707-1714`).

The JSONL log is the only delivery channel. There is no webhook hook, no callback, and no integration point for an external alerter (Telegram bot, Slack webhook, PagerDuty, email).

## Why it matters

The reason warnings exist is to give the operator time to react before the cap blocks runs. Burying them in a JSONL file the operator reads only when they remember to check defeats the purpose.

For a personal-deployment use case (Dan's gizmo), Telegram alerting is non-negotiable: silent cron failures and silent threshold-crosses are both known operational pain. For a SaaS use case, alerting needs to route per tenant.

## What to change

1. **Config schema** — add optional `alert_hooks` block to `model.md` `cost_guardrails`:
   ```yaml
   cost_guardrails:
     daily_cap_usd: 5.00
     monthly_cap_usd: 100.00
     warning_thresholds: [0.50, 0.80]
     alert_hooks:
       - type: webhook
         url: https://hooks.slack.com/...
         on: [threshold_cross, cap_blocked]
       - type: webhook
         url: https://api.telegram.org/bot.../sendMessage
         on: [threshold_cross, cap_blocked]
         template: "atomic-agents alert: {agent} at {pct}% of {period} cap"
   ```
2. **Runtime** — `_fire_cost_warning()` in `agent.py` writes the log entry, then walks `alert_hooks` and posts to each:
   - JSONL log entry as today (don't regress)
   - Webhook POST with JSON payload `{agent, period, pct, threshold, severity, ts}`
   - Failure handling: an alerter timeout or error is logged and does not block the agent run
3. **Library hook** (orthogonal) — `AtomicAgent` accepts an optional `on_cost_alert: Callable[[CostAlert], None]` parameter for programmatic use. Hub-wrapped invocations register their own callable; bare-CLI uses the YAML config.
4. **Spec doc update** — `docs/spec/05-cost-guardrails.md` documents the alert hook contract.
5. **Sample** — Caldwell `model.md` shows commented-out Telegram webhook example.
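
Steps 2 and 3 could be sketched roughly as follows. This is a minimal illustration, not the actual `agent.py` code: `CostAlert`'s exact fields, the `event` field, `post_json`, `dispatch_alert`, the 5-second timeout, and the rule that a hook without `on` receives every event are all assumptions made for the example.

```python
import json
import logging
import urllib.request
from dataclasses import asdict, dataclass
from typing import Callable, Optional

logger = logging.getLogger("cost_alerts")


@dataclass(frozen=True)
class CostAlert:
    # Fields mirror the payload listed in step 2; `event` is an assumed extra
    # so that hooks can filter on threshold_cross vs. cap_blocked.
    agent: str
    period: str
    pct: float
    threshold: float
    severity: str
    ts: str
    event: str = "threshold_cross"


def post_json(url: str, payload: dict, timeout: float) -> None:
    """POST the payload as JSON; raises on timeout or HTTP error."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout):
        pass


def dispatch_alert(
    alert: CostAlert,
    hooks: list,
    on_cost_alert: Optional[Callable[[CostAlert], None]] = None,
    send: Callable[[str, dict, float], None] = post_json,
    timeout: float = 5.0,
) -> int:
    """Best-effort delivery: no retries, and failures never propagate.

    Returns the number of successful deliveries; the caller could record
    `cost_alert_dispatched` as `count > 0`.
    """
    delivered = 0
    for hook in hooks:
        events = hook.get("on")  # assumption: missing `on` means all events
        if hook.get("type") != "webhook":
            continue
        if events is not None and alert.event not in events:
            continue
        try:
            send(hook["url"], asdict(alert), timeout)
            delivered += 1
        except Exception as exc:  # log and move on; never block the run
            logger.warning("cost alert webhook failed: %s", exc)
    if on_cost_alert is not None:
        try:
            on_cost_alert(alert)
            delivered += 1
        except Exception as exc:
            logger.warning("on_cost_alert callback failed: %s", exc)
    return delivered
```

Passing `send` as a parameter keeps the network call swappable, so tests can exercise the failure path without a real endpoint.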

## Acceptance

- `model.md` parses correctly both with and without an `alert_hooks` block (backward compatible)
- Webhook POST happens at threshold cross + at cap-blocked event, payload schema is documented
- Webhook failure does NOT block the agent run
- Programmatic `on_cost_alert` hook fires for both events
- Tests cover: hook fires once per threshold per day (not on every run), hook failure is logged, webhook timeout doesn't hang the run
- New JSONL field `cost_alert_dispatched: true/false` for audit
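
The "once per threshold per day" criterion implies some deduplication state. A minimal in-memory sketch is shown here; `ThresholdTracker` and `newly_crossed` are hypothetical names, and a real implementation would presumably need to persist this state across runs (for example by deriving it from the existing JSONL log), which this sketch does not attempt.

```python
from datetime import date


class ThresholdTracker:
    """Remembers which (period, day, threshold) combinations already fired."""

    def __init__(self, thresholds):
        self.thresholds = sorted(thresholds)
        self._fired = set()  # holds (period, day, threshold) tuples

    def newly_crossed(self, period: str, spent: float, cap: float, today: date):
        """Return thresholds crossed now that have not yet fired today."""
        if cap <= 0:
            return []
        ratio = spent / cap
        crossed = []
        for threshold in self.thresholds:
            key = (period, today, threshold)
            if ratio >= threshold and key not in self._fired:
                self._fired.add(key)
                crossed.append(threshold)
        return crossed


tracker = ThresholdTracker([0.50, 0.80])
day = date(2026, 5, 8)
first = tracker.newly_crossed("daily", 2.60, 5.00, day)   # crosses 50%
second = tracker.newly_crossed("daily", 2.70, 5.00, day)  # same day: nothing new
```

With this shape, a test for the criterion only has to call `newly_crossed` twice on the same day and check that the second call returns an empty list.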

## Open questions

- Telegram webhook needs `chat_id` per operator — is that in `alert_hooks.url` (URL has chat_id baked in) or a separate field? (URL probably; standard Telegram pattern)
- Per-agent vs per-deployment alert config: today guardrails are per-agent. Hooks probably want to be per-agent too (different agents → different routing) but with a global default at deployment level (`atomic-agents.toml` or env). Defer global default until needed.
- Webhook retry policy: probably "best effort, don't retry, don't block" — alerter is responsible for not dropping. Document.
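
To make the first question concrete, here is a sketch of the "chat_id baked into the URL" option. The Telegram Bot API `sendMessage` method takes `chat_id` and `text` parameters, so the hook URL can carry `chat_id` in its query string and the rendered `template` can be added as `text`. The function name `render_telegram_url` and the `<TOKEN>` placeholder are hypothetical, and whether the runtime should append `text` this way is itself an open design choice.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def render_telegram_url(hook_url: str, template: str, **fields) -> str:
    """Keep the query already in the URL (e.g. chat_id) and add the message text."""
    parts = urlsplit(hook_url)
    query = dict(parse_qsl(parts.query))
    query["text"] = template.format(**fields)
    return urlunsplit(parts._replace(query=urlencode(query)))


url = render_telegram_url(
    # <TOKEN> and the chat_id value are placeholders, not real credentials
    "https://api.telegram.org/bot<TOKEN>/sendMessage?chat_id=12345",
    "atomic-agents alert: {agent} at {pct}% of {period} cap",
    agent="caldwell",
    pct=80,
    period="daily",
)
```

If `chat_id` were a separate config field instead, the same function could accept it as an argument and set `query["chat_id"]` explicitly; the URL-only form just keeps the schema smaller.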

## Context

- Surfaced in deployment-readiness review (2026-05-08), gap E
- Telegram alerting is non-negotiable for Dan's gizmo deployment
- Pattern reference: similar webhook/callback hooks would be useful for other framework events (run_failed, dream_completed, eval_failed) — track here as future-but-not-this-PR
