fix(phone/voice): stop spammy Allow prompts AND signature-loop kills #59
Merged
…" prompts
PR #58 introduced 8 typed Phone/Voice capabilities (ListPhoneNumbers, PhoneLookup, PhoneFraudCheck, VoiceStatus, BuyPhoneNumber, RenewPhoneNumber, ReleasePhoneNumber, VoiceCall) but never wired them into the permissions system. They all fell through to the default fall-back branch in PermissionManager.check() and triggered an interactive "Allow?" prompt on every single call.
Real-world impact today: the agent placed one outbound call (correct, should ask), then polled VoiceStatus every few seconds while the call was running. Each VoiceStatus poll prompted the user: five+ prompts per minute during a 1-min call. Confirmed from a user session today (2026-05-18) showing 11 separate VoiceStatus tool_use entries against one call_id.
Classification follows the same side-effect rule the rest of the permission system uses (READ_ONLY = "doesn't change the world outside the gateway", regardless of price):
READ_ONLY (auto-allow):
- ListPhoneNumbers: cached wallet inventory read ($0.001)
- PhoneLookup: carrier + line-type query ($0.01)
- PhoneFraudCheck: SIM-swap signals ($0.05)
- VoiceStatus: free GET poll on existing call
ASK (explicit user consent every call, matches Write/Edit/Bash):
- VoiceCall: dials a real human, $0.54, can't be undone
- BuyPhoneNumber: holds a Twilio number for 30 days, $5
- RenewPhoneNumber: extends a held number, $5
- ReleasePhoneNumber: permanently returns number to pool, irreversible
Pricing is intentionally orthogonal: WebSearch and ImageGen also charge USDC but live in READ_ONLY because they don't dial anyone or permanently mutate gateway state.
Does NOT touch the separate signature-loop-guard bug (VoiceStatus is a poll-style tool, but the guard treats repeated identical inputs as a stuck loop; that's a follow-up PR that needs a polling-tool whitelist in src/agent/loop.ts).
Test plan:
npm test → 405/405 pass (no permission test references this set directly; classification is a pure additive change)
Manual: start a session, call VoiceCall, observe one prompt; agent then polls VoiceStatus 5x with no further prompts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion fix to the permissions classification in the same PR. Without
this, even after we stop spamming "Allow?" prompts, the agent still
trips Franklin's signature-loop guard the moment it has to wait for a
real call to finish.
Background: VoiceStatus ships as a naked one-shot GET — agent calls it,
gets `in-progress`, has to decide to call it again, gets `in-progress`,
again, again... Same call_id every time, so the input signature is
literally identical. `loop.ts:turnSignatureCounts` triggers at 5 and
kills the turn with "Loop stopped: ... repeated the same input 5×".
Confirmed in a user session (2026-05-18, screenshot shared): turn 1
fired 1 VoiceCall + 5 VoiceStatus polls, all hit while the call was
still ringing/in-progress, then died at the 5th. Call kept running in
the cloud; agent never saw the transcript on the original turn. User
had to ask "give me the transcript" in a second turn to recover.
The fix: mirror the pattern videogen.ts and imagegen.ts already use.
VoiceStatus blocks internally, polling every 5 s until a terminal
status (completed / failed / no-answer / busy / voicemail / cancelled)
or the 35-min ceiling. Agent emits exactly one VoiceStatus tool_use
per call and gets back the final transcript when the call ends.
Side benefits:
- Each poll iteration writes the latest snapshot to the local call log
(so the panel Calls tab updates live even though the agent is blocked
in the tool).
- ctx.abortSignal is honored — Ctrl-C cancels the poll cleanly.
- 35 min ceiling = 30 min Bland max_duration + 5 min headroom for the
upstream to settle / mark final status.
Updated tool description tells the model explicitly: "CALL THIS ONCE.
The tool blocks internally" — so even cheap models that pattern-match
on description (deepseek, haiku) won't try to poll in a loop.
Test plan:
npm test → 405/405 pass (same baseline as the permissions change;
no test references VoiceStatus directly)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
PR #58 added 8 typed Phone/Voice tools but missed two integration points. Both surface as agent-killing bugs the moment a real user (especially on cheap models routed by clawrouter) tries to make a call. This PR fixes both in one go so the reviewer sees the full picture.
Bug 1 — "Allow?" prompt spam
PR #58 never wired the 8 tools into `PermissionManager`. They all fell through to `decide()`'s default branch and triggered an interactive "Allow?" prompt on every single call — including the side-effect-free polling tool VoiceStatus. Users saw 5+ "Allow VoiceStatus?" prompts per minute during a single call.
Bug 2 — Signature-loop guard kills polling turns
Even with permissions fixed, VoiceStatus shipped as a naked one-shot GET. The agent has to drive the poll cadence itself, calling VoiceStatus(call_id=X) repeatedly until status flips to terminal. The signature counter — `turnSignatureCounts.get('VoiceStatus:{call_id:"X"}')` — climbs by one per poll. At 5, `loop.ts` kills the turn with `Loop stopped: ... repeated the same input 5×`. Confirmed in a user session 2026-05-18: turn 1 fired 1 VoiceCall + 5 VoiceStatus before being killed; the call kept running upstream but the agent never saw the transcript.
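To make the failure mode concrete, here is a minimal sketch of a per-turn signature guard. It is not the actual `loop.ts` code: only `turnSignatureCounts` and the threshold of 5 come from this PR, while `TurnLoopGuard`, `signatureOf`, `LOOP_LIMIT`, and the exact signature format are assumptions made for illustration.

```typescript
// Hypothetical sketch of a per-turn signature guard (not the real loop.ts).
const LOOP_LIMIT = 5; // threshold quoted in the PR description

// Assumed signature format: tool name plus serialized input.
// Identical tool + identical input => identical signature.
function signatureOf(tool: string, input: Record<string, unknown>): string {
  return `${tool}:${JSON.stringify(input)}`;
}

class TurnLoopGuard {
  private turnSignatureCounts = new Map<string, number>();

  // Records one tool call; returns true when the turn should be stopped.
  record(tool: string, input: Record<string, unknown>): boolean {
    const sig = signatureOf(tool, input);
    const count = (this.turnSignatureCounts.get(sig) ?? 0) + 1;
    this.turnSignatureCounts.set(sig, count);
    return count >= LOOP_LIMIT;
  }
}

// An agent-driven poll: the same call_id on every VoiceStatus call.
const guard = new TurnLoopGuard();
const stopped: boolean[] = [];
for (let i = 0; i < 5; i++) {
  stopped.push(guard.record("VoiceStatus", { call_id: "X" }));
}
// The fifth identical call trips the guard.
```

Under these assumptions a legitimate poll is indistinguishable from a stuck loop, because nothing in the input changes between calls.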
Fix 1 — Classify Phone & Voice tools in permissions.ts
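Splits the 8 tools by side effect, matching the rule the rest of the permission system uses (READ_ONLY = "doesn't change the world outside the gateway"; price is orthogonal):
READ_ONLY (auto-allow):
- ListPhoneNumbers: cached wallet inventory read ($0.001)
- PhoneLookup: carrier + line-type query ($0.01)
- PhoneFraudCheck: SIM-swap signals ($0.05)
- VoiceStatus: free GET poll on existing call
ASK (explicit user consent every call, matches Write/Edit/Bash):
- VoiceCall: dials a real human, $0.54, can't be undone
- BuyPhoneNumber: holds a Twilio number for 30 days, $5
- RenewPhoneNumber: extends a held number, $5
- ReleasePhoneNumber: permanently returns number to pool, irreversible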
`WebSearch` and `ImageGen` also charge USDC but live in READ_ONLY because they don't dial anyone or permanently mutate gateway state. Same logic here.
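The classification can be sketched roughly as follows. This is an illustration only, not the contents of `permissions.ts`: the tool names come from this PR, but `Decision`, `READ_ONLY_TOOLS`, `ASK_TOOLS`, and `classify` are hypothetical names, and the real `PermissionManager` may be structured differently.

```typescript
// Hypothetical sketch of side-effect-based classification (not the real permissions.ts).
type Decision = "allow" | "ask";

// Tools that don't change the world outside the gateway: auto-allowed,
// regardless of what they cost.
const READ_ONLY_TOOLS = new Set<string>([
  "ListPhoneNumbers",
  "PhoneLookup",
  "PhoneFraudCheck",
  "VoiceStatus",
]);

// Tools with real-world side effects: explicit consent on every call.
const ASK_TOOLS = new Set<string>([
  "VoiceCall",
  "BuyPhoneNumber",
  "RenewPhoneNumber",
  "ReleasePhoneNumber",
]);

function classify(tool: string): Decision {
  if (READ_ONLY_TOOLS.has(tool)) return "allow";
  if (ASK_TOOLS.has(tool)) return "ask";
  // Default branch: an unclassified tool prompts. This is the path all
  // 8 tools were taking before they were classified.
  return "ask";
}
```

Listing the ASK tools explicitly yields the same decision as the default branch in this sketch; the point of naming them is to record that the prompt is intentional rather than an oversight.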
Fix 2 — VoiceStatus internal poll-until-terminal
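Refactor VoiceStatus to mirror the pattern videogen.ts and imagegen.ts already use: the tool blocks internally, polling every 5 s until a terminal status (completed / failed / no-answer / busy / voicemail / cancelled) or the 35-min ceiling.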
Agent emits exactly one VoiceStatus tool_use per call and gets back the final transcript when the call ends. Signature counter stays at 1, guard never trips.
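A minimal sketch of that poll-until-terminal loop, under stated assumptions: the terminal statuses, the 5 s interval, the 35-min ceiling, the abort signal, and the per-iteration snapshot write come from the commit message, while `pollUntilTerminal`, `CallSnapshot`, `fetchStatus`, `onSnapshot`, and `sleep` are hypothetical names. Returning the last snapshot when the ceiling is reached is also an assumption; the real tool may report a timeout differently.

```typescript
// Hypothetical sketch of an internally blocking status poll (not the real tool).
interface CallSnapshot {
  status: string;
  transcript?: string;
}

const TERMINAL = new Set(["completed", "failed", "no-answer", "busy", "voicemail", "cancelled"]);
const POLL_INTERVAL_MS = 5_000;
const CEILING_MS = 35 * 60 * 1000; // 30 min max call duration + 5 min headroom

// Sleeps for `ms`, rejecting early if the signal aborts (e.g. Ctrl-C).
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    function onAbort(): void {
      clearTimeout(timer);
      reject(new Error("aborted"));
    }
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

async function pollUntilTerminal(
  fetchStatus: () => Promise<CallSnapshot>,
  opts: {
    intervalMs?: number;
    ceilingMs?: number;
    signal?: AbortSignal;
    onSnapshot?: (snap: CallSnapshot) => void; // e.g. write to the local call log
  } = {},
): Promise<CallSnapshot> {
  const intervalMs = opts.intervalMs ?? POLL_INTERVAL_MS;
  const deadline = Date.now() + (opts.ceilingMs ?? CEILING_MS);
  for (;;) {
    const snap = await fetchStatus();
    opts.onSnapshot?.(snap);
    if (TERMINAL.has(snap.status)) return snap;
    // Assumption: on hitting the ceiling, hand back the latest snapshot.
    if (Date.now() + intervalMs > deadline) return snap;
    await sleep(intervalMs, opts.signal);
  }
}
```

Because the loop lives inside the tool, the agent loop sees a single tool call with a single input signature, however many HTTP polls happen underneath.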
Test plan
npm test → 405/405 pass (no test references the permission set or VoiceStatus directly)
Why both fixes belong in one PR
They're the same regression — "PR #58 shipped typed Phone/Voice tools but didn't wire them into the surrounding agent infrastructure." Reviewing them together makes the failure mode obvious. Splitting them would leave the agent half-broken on whichever side merged first.
🤖 Generated with Claude Code