
blog: Reserving Authority When You Can't Pause#651

Open
amavashev wants to merge 3 commits into
main from
blog/voice-agent-budgets-when-you-cant-pause-to-reserve

Conversation

@amavashev
Contributor

Summary

First post in a new pillar: voice / realtime agent governance. Net-new surface for the corpus.

The post identifies the structural constraint: reserve-commit assumes the agent can wait synchronously for ALLOW, but voice agents can't — a 100ms sync gate on each audio frame would push the conversation past the ~700ms natural-feel threshold. The fix is not to abandon the gate; it is to position it where the latency budget can absorb it.

Four patterns covered:

  1. Predictive reservation, true-up later — reserve N minutes upfront, commit actual at call end
  2. Tier-aware gating — sync gate on slow-path tool calls, predictive reservation on fast-path audio
  3. Time-bounded floor authority — per-second auto-replenish for very high-throughput deployments
  4. Speculative commit with deny window — slow-path only, since audio is unrecoverable
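Pattern 1 can be sketched in a few lines. This is a minimal illustration, not an API from the post: `BudgetLedger`, `Reservation`, and the minutes-based unit are all assumed names for the sketch.

```python
import time
from dataclasses import dataclass

# Sketch of pattern 1: reserve predicted minutes before the call
# connects, commit actual usage at call end. BudgetLedger and its
# method names are illustrative, not an API from the post.

@dataclass
class Reservation:
    reservation_id: str
    reserved_minutes: float
    started_at: float

class BudgetLedger:
    def __init__(self, remaining_minutes: float):
        self.remaining_minutes = remaining_minutes

    def reserve(self, minutes: float, reservation_id: str) -> Reservation:
        # The synchronous ALLOW/DENY gate runs once, pre-call, where
        # latency is invisible -- never per audio frame.
        if minutes > self.remaining_minutes:
            raise RuntimeError("DENY: predicted call exceeds remaining budget")
        self.remaining_minutes -= minutes
        return Reservation(reservation_id, minutes, time.monotonic())

    def commit(self, res: Reservation, actual_minutes: float) -> None:
        # True-up: release the unused reservation (or absorb an overrun).
        self.remaining_minutes += res.reserved_minutes - actual_minutes

ledger = BudgetLedger(remaining_minutes=60.0)
res = ledger.reserve(10.0, "call-123")    # predict ~10 min up front
# ... the call runs with no per-frame gating ...
ledger.commit(res, actual_minutes=6.5)    # 3.5 unused minutes released
```

If actual usage exceeds the reservation, the same commit absorbs the overrun, which is why this pattern pairs with provider-side per-call caps as a backstop.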

Voice-specific failure modes (talking-to-itself loops, stuck conversations, premium-tier escalation runaway, hold-music wall-clock blind spots) get their own table. Mirrors the PocketOS two-layer fix: per-call provider caps + agent-side runtime authority.

Stack matrix for OpenAI Realtime / Vapi / Retell AI / ElevenLabs shows where the gate can sit in each.

Author: Albert Mavashev
Date: 2026-05-20
Word count: ~3,100 (body)

Reviews

  • Internal cycles 1–3 (scorecard 9.3/10)
  • Glossary auto-linker applied 7 contextual links
  • Codex external review: round 1 REVISE-MINOR (9 findings, 9 applied / 2 pushed back), round 2 SHIP

Codex verified upstream:

  • OpenAI Realtime API event surface (response.function_call_arguments.delta/.done are the actual events — my original `response.function_call` was wrong; fixed)
  • ElevenLabs / Vapi / Retell AI pricing pages
  • Cycles-docs internal targets

Cycle 1 fact-check caught and fixed:

  • OpenAI Realtime latency attribution softened (the OpenAI page is bot-blocked from WebFetch; reframed as industry observation with turn-taking research convergence)
  • Retell AI pricing range widened from $0.07-$0.15 to $0.07-$0.31 (original ceiling was understated)
  • Opener cost figure corrected ($390 in 17 minutes was implausible given the post's own pricing table; lowered to $90)
  • ElevenLabs $0.24/min ceiling clarified as requiring LLM + telephony at cost on top of burst hosting
  • Vapi $0.115-$0.42/min labeled as derived estimate
  • Provider-layer cap claim softened (pricing pages don't uniformly establish hard caps)

Per-dimension scores

Dimension Score
Factual accuracy 9.5
Credibility 9
Cross-links 9
SEO (title 40/51, desc 152/160) 9.5
Code accuracy 9
Structure & flow 9.5
Terminology 9.5
Tone & style 9.5

Overall: 9.3 / 10

Test plan

Dependencies and order

This post links to three sibling posts that are on PR branches awaiting merge: `/blog/agent-memory-writes-are-actions-too` (PR #648), `/blog/when-coding-agents-press-merge` (PR #649), `/blog/computer-use-agents-have-no-tool-boundary` (PR #650). Merge order: #648 → #649 → #650 → this PR, so the trilogy + this voice post all land with working cross-links.

Tags

Three new pillar tags introduced for the voice surface: `voice-agents`, `realtime`, `latency`. Future voice posts should align on these. Other tags (`budgets`, `runtime-authority`, `agents`, `engineering`, `RISK_POINTS`) match corpus convention.

amavashev added 2 commits May 15, 2026 13:00
New pillar post on voice / realtime agent budgets. Net-new surface
for the corpus — first post in the voice-agents pillar.

The post identifies the structural constraint: reserve-commit assumes
the agent can wait synchronously for ALLOW. Voice and realtime agents
can't — a 100ms sync gate on each audio frame would push the
conversation past the ~700ms natural-feel threshold. The fix is not to
abandon the gate; it is to position it where the latency budget can
absorb it.

Four patterns covered:
1. Predictive reservation, true-up later (reserve N minutes upfront,
   commit actual at call end)
2. Tier-aware gating (sync gate on the slow-path tool calls,
   predictive reservation on the fast-path audio)
3. Time-bounded floor authority (per-second auto-replenish for very
   high-throughput deployments)
4. Speculative commit with deny window (slow-path only, since audio is
   unrecoverable)

Stack matrix for OpenAI Realtime / Vapi / Retell AI / ElevenLabs shows
where the gate can sit in each. Voice-specific failure modes (talking-
to-itself loops, stuck conversations, premium-tier escalation runaway,
hold-music wall-clock blind spots) get their own table. Mirrors the
PocketOS two-layer fix: per-call provider caps + agent-side runtime
authority.

Internal cross-links to tracking-tokens-in-a-streaming-llm-response
(closest sibling), estimate-drift (calibration),
ai-agent-action-control (parent tier model),
retry-storms-and-idempotency, when-budget-runs-out, multi-tenant
cost control, plus the just-shipped trilogy (memory-writes, merge,
computer-use).

External citations: OpenAI Realtime delivery scale page, callsphere
pricing analysis, implicit acknowledgment of ElevenLabs/Vapi/Retell
pricing models.

Reviews: internal cycles 1-3 (scorecard 9.3/10), glossary linker added
7 contextual links. Cycle 1 fact-check caught and fixed:
- OpenAI Realtime latency attribution softened (the OpenAI page is
  bot-blocked from WebFetch, so "OpenAI targets X ms" became
  "production deployments typically land at X ms with industry
  guidance converging on ~700ms").
- Retell AI pricing range widened from $0.07-$0.15 to $0.07-$0.31
  (the original ceiling was understated).
- Opener cost figure corrected ($390 in 17 minutes was implausible
  given the post's own pricing table; lowered to $90).
- Retell AI capitalization standardized ("Retell AI" everywhere in
  brand-list contexts, not bare "Retell").

Three new pillar tags introduced: voice-agents, realtime, latency.
…-pause-to-reserve

Apply/skip tally: 9 applied, 2 pushed back.

Applied:
- `response.function_call` → `response.function_call_arguments.*`:
  the OpenAI Realtime API uses function-call output items and the
  function_call_arguments streaming events; my original event name
  was not a real Realtime server event. Fixed in both the prose and
  the stack-by-stack table.
- 80-150 ms relay hop: removed the specific band attribution. The
  OpenAI page does not state it. Generic phrasing: "a forwarding hop
  sized to fit inside the conversation's latency budget."
- ElevenLabs row: clarified the $0.08-$0.24/min framing. Hosting is
  $0.08/min flat or $0.16/min burst; the $0.24 ceiling derives once
  LLM and telephony layer on at cost.
- Vapi row: labeled the $0.115-$0.42/min range as an estimate (it's
  derived from $0.05/min orchestration plus a BYOK provider stack at
  cost; the actual all-in depends on provider choices).
- 17-minute "$1.50-$8.00 model spend alone": tightened to "against
  the per-minute stack rates above" since the rates in the table
  mix all-in / provider / orchestration models.
- Provider-layer caps: softened from "OpenAI, Vapi, Retell AI, and
  ElevenLabs all expose per-call or per-session limits" to "to
  whatever degree each provider exposes them — typically through
  per-session budget headers, dashboard caps, or programmatic
  limits." Pricing pages don't uniformly establish hard caps.
- "Most production voice teams use this only..." for speculative
  commit: softened to "This pattern is usually safer on the
  slow-path tool layer."
- Description trimmed 162 → 152 chars: changed "—" to ":", "sit
  synchronously in the path" to "sync on the hot path."
- `reserve-commit` glossary link: pointed to
  /protocol/how-reserve-commit-works-in-cycles instead of
  /glossary#reservation (reserve-commit is a lifecycle term, not the
  reservation entry).
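The event-name fix above implies a natural shape for pattern 2's tier-aware gate: pass audio deltas through untouched, gate only on completed tool-call arguments. A sketch, where `check_authority` stands in for the reserve-commit round-trip and any event fields beyond the type names are assumptions:

```python
import json

# Tier-aware gating sketch: fast-path audio is never gated; the sync
# ALLOW/DENY check runs only when a tool call's arguments complete,
# where the latency budget can absorb it. check_authority and the
# event payload fields are illustrative assumptions.

def check_authority(tool_name: str, args: dict) -> bool:
    # Placeholder for a synchronous reserve-commit ALLOW/DENY call.
    return args.get("amount", 0) <= 100

def handle_event(event: dict, audio_out, tool_runner):
    etype = event["type"]
    if etype == "response.audio.delta":
        # Fast path: forward the frame immediately, no gate.
        audio_out(event["delta"])
    elif etype == "response.function_call_arguments.done":
        # Slow path: the tool call already costs hundreds of ms, so a
        # sync gate here fits inside the existing latency budget.
        args = json.loads(event["arguments"])
        if check_authority(event["name"], args):
            return tool_runner(event["name"], args)
        return {"error": "DENY: over budget"}
```

The point of the split is that the `.done` event is the first moment the full arguments exist, so it is also the cheapest place to block.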

Skipped, with reason:
- Body cross-link count (11) above 5-8 pillar target: three of the
  eleven are the trilogy references in a single closing sentence
  that names the sibling extension series (memory-writes, merge,
  computer-use). They are coherent as a triple, not redundant.
- 2026-05-20 publish date: intentional sequence after the trilogy
  (5/16, 5/18, 5/19, 5/20).

Codex verified upstream: ElevenLabs/Vapi/Retell AI pricing pages,
OpenAI Realtime API event surface (function_call_arguments.delta /
.done are the actual streaming events), and the cycles-docs main-
branch internal targets. Sibling links to memory-writes, merge,
and computer-use treated as just-merged via PR #648-#650.
…o 2026-06-06

Moved from 2026-05-20 to 2026-06-06 to match the weekly publishing
cadence for the action-authority extension arc. Sequence: memory 5/16,
merge 5/23, computer-use 5/30, voice 6/06.
